<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <id>https://dloranc.github.io</id>
    <title>Dawid Loranc Blog</title>
    <updated>2024-07-06T00:00:00.000Z</updated>
    <generator>https://github.com/jpmonette/feed</generator>
    <link rel="alternate" href="https://dloranc.github.io"/>
    <subtitle>Dawid Loranc Blog</subtitle>
    <icon>https://dloranc.github.io/images/icons/76x76.jpg</icon>
    <entry>
        <title type="html"><![CDATA[New version of the blog]]></title>
        <id>https://dloranc.github.io/2024/07/06/new-version-of-the-blog</id>
        <link href="https://dloranc.github.io/2024/07/06/new-version-of-the-blog"/>
        <updated>2024-07-06T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[I finally got motivated and decided to refresh this blog.]]></summary>
        <content type="html"><![CDATA[<p>I finally got motivated and decided to refresh this blog. The first problem I encountered was that Sculpin and the entire PHP-based environment had stopped working. In addition, Travis CI hadn't built the project for a long time. I moved to Docusaurus, developed by Meta (Node.js), and used GitHub Actions to build the project. It took me half a day to transfer the old posts, but overall I'm happy with this tool. I know the name implies an emphasis on documentation, but you can build a&nbsp;blog with it pretty quickly. All I had to do was add search and English language support.</p>
<p>What are my plans for the future? I think I will continue writing about machine learning and AI, but I intend to broaden the scope to other areas of computer science. I will probably also write in English.</p>
<p>That's it, I hope I will manage to write regularly :)</p>]]></content>
        <category label="Organizational matters" term="Organizational matters"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Docker on Windows - problems with volumes, symlinks and connectivity]]></title>
        <id>https://dloranc.github.io/2017/08/24/docker-on-windows-problems-with-volumes-symlinks-and-polling</id>
        <link href="https://dloranc.github.io/2017/08/24/docker-on-windows-problems-with-volumes-symlinks-and-polling"/>
        <updated>2017-08-24T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[I've recently started learning Docker (I know, a bit late) and have run into a few problems with it. Unfortunately, Windows doesn't like Docker very much.]]></summary>
        <content type="html"><![CDATA[<p>I started learning Docker a few days ago. So far I have used Vagrant for virtualization, but it didn't sit very well with me. But I don't want to talk about that. I haven't quite grasped Docker yet, but after a few attempts and one project, I've found that it's a pretty cool tool. However, I'm not going to write a tutorial. If you are interested in one, I invite you to check out <a href="https://docker-curriculum.com/" target="_blank" rel="noopener noreferrer">docker-curriculum.com</a>. It's pretty cool and does a good job of explaining the basics.</p>
<p>While playing around with Docker, I encountered a few problems and that's what this post is about.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="volumes">Volumes<a class="hash-link" aria-label="Direct link to Volumes" title="Direct link to Volumes" href="https://dloranc.github.io/2017/08/24/docker-on-windows-problems-with-volumes-symlinks-and-polling#volumes">​</a></h2>
<p>While trying to work with a simple project, I found that the container could not see the files, even though I had set the appropriate <em>volumes</em> in <code>docker-compose.yml</code>. Starting <em>bash</em> in the container with something like <code>-v "$PWD/":/var/www/html</code> also failed. It turned out that <a href="https://docs.docker.com/compose/gettingstarted/#step-6-re-build-and-run-the-app-with-compose" target="_blank" rel="noopener noreferrer">the problem is Windows</a>. Since I can't use <em>Hyper-V</em> because I don't have <em>Windows 10 Professional</em>, I have to use <em>Docker Toolbox</em>. After two hours of struggle, it turned out that the container sees the files only if they are somewhere under <code>c:\Users</code>. As far as I can tell, that is how Docker Toolbox sets up VirtualBox: by default only that folder is shared with the virtual machine. For the regular Docker for Windows, I saw <a href="https://docs.docker.com/docker-for-windows/#shared-drives" target="_blank" rel="noopener noreferrer">in the documentation</a> that you can fix the issue in the GUI by marking each drive as shared.</p>
<p>One thing puzzles me. Since you're using containers to have an isolated environment, maybe this should somehow apply to file systems? Because that way the isolation is not complete.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="symlinks">Symlinks<a class="hash-link" aria-label="Direct link to Symlinks" title="Direct link to Symlinks" href="https://dloranc.github.io/2017/08/24/docker-on-windows-problems-with-volumes-symlinks-and-polling#symlinks">​</a></h2>
<p>The second issue is also related to Windows. After calling the command in the container:</p>
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token function" style="color:#d73a49">npm</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">install</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&lt;</span><span class="token plain">package_name</span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain"> --save-dev</span><br></span></code></pre></div></div>
<p>NPM throws something like:</p>
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token function" style="color:#d73a49">npm</span><span class="token plain"> ERR</span><span class="token operator" style="color:#393A34">!</span><span class="token plain"> code EPROTO</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token function" style="color:#d73a49">npm</span><span class="token plain"> ERR</span><span class="token operator" style="color:#393A34">!</span><span class="token plain"> errno </span><span class="token parameter variable" style="color:#36acaa">-71</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token function" style="color:#d73a49">npm</span><span class="token plain"> ERR</span><span class="token operator" style="color:#393A34">!</span><span class="token plain"> syscall symlink</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token function" style="color:#d73a49">npm</span><span class="token plain"> ERR</span><span class="token operator" style="color:#393A34">!</span><span class="token plain"> EPROTO: protocol error, symlink </span><span class="token string" style="color:#e3116c">'../uglify-js/bin/uglifyjs'</span><span class="token plain"> -</span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'var/www/html/node_modules/.bin/uglifyjs'</span><br></span></code></pre></div></div>
<p>Well, yes, creating symlinks fails here, most likely because the directory shared from Windows through VirtualBox does not support them, even though Windows itself can create symlinks (and the related <code>junction</code>s). The solution turned out to be adding the <code>--no-bin-links</code> switch to the command:</p>
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token function" style="color:#d73a49">npm</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">install</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&lt;</span><span class="token plain">package_name</span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain"> --save-dev --no-bin-links</span><br></span></code></pre></div></div>
<p>What's worse, this most likely won't help in other cases where a tool needs symlinks, so I'll have to find some other way.</p>
<p>Maybe I'll finally install Linux? :-)</p>]]></content>
        <category label="Problems" term="Problems"/>
        <category label="Docker" term="Docker"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Daj Się Poznać 2017 - competition summary]]></title>
        <id>https://dloranc.github.io/2017/05/31/daj-sie-poznac-2017-competition-summary</id>
        <link href="https://dloranc.github.io/2017/05/31/daj-sie-poznac-2017-competition-summary"/>
        <updated>2017-05-31T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[As in the title, it's time to sum up my participation in the Daj Się Poznać 2017 competition. What went well? What didn't? What's next?]]></summary>
        <content type="html"><![CDATA[<p>Well, I was going to write a post about the project, but I just couldn't bring myself to do it, so I decided to do what the rest of the participants did, that is, sum up the competition and my participation in it. Sorry, this will be a bit boring :)</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="project---development-history">Project - development history<a class="hash-link" aria-label="Direct link to Project - development history" title="Direct link to Project - development history" href="https://dloranc.github.io/2017/05/31/daj-sie-poznac-2017-competition-summary#project---development-history">​</a></h2>
<p>Maybe you don't remember, but as part of the competition, I decided to develop a bot for Starcraft 2 using reinforcement learning. It wasn't a very smart idea, because there was no API for creating bots at that time, and I naively assumed that Blizzard and DeepMind would have one ready in the <em>first quarter of 2017</em>. They didn't make it; the API will only be available in the summer. I didn't know what to do, so at first I decided to create my own simplified Starcraft in Phaser (a framework for writing games in JavaScript). I even wrote some of it, and it lives in a repository (<a href="https://github.com/dloranc/simple-rts-and-rl-example" target="_blank" rel="noopener noreferrer">dloranc/simple-rts-and-rl-example</a>), but I quickly became discouraged. It was too much work. I would have had to write things like path-finding, and I wasn't happy about that. Ultimately, I abandoned this project, but it is possible that I will do something with it in the near future, such as a simple reinforcement learning example.</p>
<p>Finally, I decided to take on Starcraft: Brood War, the predecessor of Starcraft 2. Using BWAPI and BWMirror, I fairly quickly put together a simple Java bot that plays a very simple strategy known as <em>5 pool</em>. I entered the bot into the SSCAIT tournament/ladder. Surprisingly, it does quite well on the ladder considering its simplicity. Its results can be viewed on the <a href="http://sscaitournament.com/index.php?action=scores" target="_blank" rel="noopener noreferrer">results page</a>. The bot lives in the <a href="https://github.com/dloranc/five-pool-bot" target="_blank" rel="noopener noreferrer">dloranc/five-pool-bot</a> repository.</p>
<p>You can see the bot in action here:</p>
<iframe src="https://www.youtube.com/embed/xvI2EuLPg6o" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen=""></iframe>
<p>After writing this bot, I planned to use RL, but it turned out that BWMirror is 32-bit only, so I couldn't use the Java libraries I needed, because they are 64-bit only. I would have had to port the bot from BWMirror to something else. I didn't really feel like it, so I decided to switch to TorchCraft, a library that allows you to develop bots using machine learning. I'm still struggling with it, and it's not easy because this library has virtually no documentation.</p>
<p>In total, I created four repositories for the project, with 103 commits altogether, which is rather average. I also regret that I didn't manage to do anything with machine learning.</p>
<p>I will continue to develop all the projects, except maybe the 5 pool bot, because it is finished and there is not much that can be added to it.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="the-rest-of-the-blog">The rest of the blog<a class="hash-link" aria-label="Direct link to The rest of the blog" title="Direct link to The rest of the blog" href="https://dloranc.github.io/2017/05/31/daj-sie-poznac-2017-competition-summary#the-rest-of-the-blog">​</a></h2>
<p>I'm actually quite satisfied with the remaining posts. During the competition, I started a series of posts about reinforcement learning based on the book I was reading, Reinforcement Learning: An Introduction (Richard S. Sutton and Andrew G. Barto). All of them can be found under the tag <a href="https://dloranc.github.io/tags/sutton-and-barto">Sutton &amp; Barto</a> and the repository with examples is <a href="https://github.com/dloranc/reinforcement-learning-an-introduction" target="_blank" rel="noopener noreferrer">dloranc/reinforcement-learning-an-introduction</a>. I'm doing quite well, but I think I'm explaining it all a bit poorly. There is a lot of room for improvement here. Starting this series was a good idea; earlier in the competition I had trouble choosing topics. I didn't have any ideas for posts, but now I just need to read a chapter from the book and I'm set :)</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="what-did-the-competition-give-me">What did the competition give me?<a class="hash-link" aria-label="Direct link to What did the competition give me?" title="Direct link to What did the competition give me?" href="https://dloranc.github.io/2017/05/31/daj-sie-poznac-2017-competition-summary#what-did-the-competition-give-me">​</a></h2>
<p>Where to start... I learned to write posts relatively quickly. No wonder: people say you get faster at writing by writing, and it's true. However, I won't say that my posts are good, because as I wrote above, I have a problem with explaining things simply. I overcomplicate some things. I have a lot of practice ahead of me here; I will have to look through my posts and think about what can be written better and more simply.</p>
<p>The second thing is that I managed to write quite regularly. I've always had a problem with this. Here I must admit that I took part in the first edition of "Daj Się Poznać" ("Let yourself be known"), the one in which there were not even a hundred participants. Back then I dropped out after two posts. It was a tragedy. However, I made it to the end of this year's edition and I consider it a success. Writing and developing the project was incredibly time-consuming, but it worked and I'm happy with it.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="have-i-made-myself-known">Have I made myself known?<a class="hash-link" aria-label="Direct link to Have I made myself known?" title="Direct link to Have I made myself known?" href="https://dloranc.github.io/2017/05/31/daj-sie-poznac-2017-competition-summary#have-i-made-myself-known">​</a></h2>
<p>Heh, not really. It's true that there was some interest in my bot at the beginning and a few people even commented, but nothing has happened on my blog for a long time :) Analytics doesn't show much either. The highest number of sessions I had was 38, about halfway through the competition. I did some promotion on Twitter, but the last month was poor. To tell you the truth, I thought it would be worse.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="acknowledgments">Acknowledgments<a class="hash-link" aria-label="Direct link to Acknowledgments" title="Direct link to Acknowledgments" href="https://dloranc.github.io/2017/05/31/daj-sie-poznac-2017-competition-summary#acknowledgments">​</a></h2>
<p>Finally, I would like to thank Maciej Aniserowicz for organizing this competition. By the way, I apologize for my RSS :) I would also like to thank the readers of this blog (are there any?) for reading my stuff (comment more often, please!). I would also like to thank the other competition participants for sharing their knowledge and for the conversations on Slack.</p>
<p>Thank you very much :)</p>]]></content>
        <category label="Organizational matters" term="Organizational matters"/>
        <category label="DSP 2017" term="DSP 2017"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Multi-armed bandit - Upper Confidence Bound]]></title>
        <id>https://dloranc.github.io/2017/05/31/multi-armed-bandit-upper-confidence-bound</id>
        <link href="https://dloranc.github.io/2017/05/31/multi-armed-bandit-upper-confidence-bound"/>
        <updated>2017-05-31T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Continuation of the MAB topic. This time I wrote about a way to optimize exploration.]]></summary>
        <content type="html"><![CDATA[<p><em>This post is part of my struggle with the book <a href="http://incompleteideas.net/sutton/book/the-book-2nd.html" target="_blank" rel="noopener noreferrer">"Reinforcement Learning: An Introduction"</a> by Richard S. Sutton and Andrew G. Barto. Other posts systematizing my knowledge and presenting the code I wrote can be found under the tag <a href="https://dloranc.github.io/tags/sutton-and-barto">Sutton &amp; Barto</a> and in the repository <a href="https://github.com/dloranc/reinforcement-learning-an-introduction" target="_blank" rel="noopener noreferrer">dloranc/reinforcement-learning-an-introduction</a>.</em></p>
<hr>
<p>In the <strong>multi-armed bandit</strong> problem we need exploration to find the best action because the value of each action is uncertain. Our estimate of an action's value changes each time we perform that action and observe the reward. The more often a given action is selected, the more certain we are that its estimated value is correct. So far, however, we have not taken this rather intuitive observation into account in our calculations. During exploration, actions were selected uniformly at random, regardless of how close their estimated values were to the best one or how certain those estimates were.</p>
<p>Let's recall how we chose the best action:</p>
<p><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>A</mi><mi>t</mi></msub><mo>≐</mo><mi><munder><mo><mi mathvariant="normal">arg max</mi><mo>⁡</mo></mo><mi>a</mi></munder></mi><mtext> </mtext><msub><mi>Q</mi><mi>t</mi></msub><mo stretchy="false">(</mo><mi>a</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">A_t \doteq \underset{a}{\argmax}\&gt; Q_t(a)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em"></span><span class="mord"><span class="mord mathnormal">A</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">≐</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:1.6444em;vertical-align:-0.8944em"></span><span class="mord"><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.4306em"><span style="top:-2.2056em;margin-left:0em"><span class="pstrut" style="height:3em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">a</span></span></span></span><span style="top:-3em"><span class="pstrut" style="height:3em"></span><span><span class="mop"><span class="mop"><span class="mord mathrm" style="margin-right:0.01389em">arg</span><span class="mspace" 
style="margin-right:0.1667em"></span><span class="mord mathrm">max</span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.8944em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">a</span><span class="mclose">)</span></span></span></span></p>
<p>Which corresponds to these lines of code:</p>
<div class="language-Python language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">argmax </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">argmax</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> argmax</span><br></span></code></pre></div></div>
<p>From the <code>choose_action</code> method:</p>
<div class="language-Python language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">choose_action</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    rand </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">uniform</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> rand </span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain"> self</span><span 
class="token punctuation" style="color:#393A34">.</span><span class="token plain">epsilon</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># exploit</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        argmax </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">argmax</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> argmax</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">else</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># explore</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> randint</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" 
style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">arms </span><span class="token operator" style="color:#393A34">-</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div>
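<p>For context, here is a minimal, self-contained sketch of how such a method might sit inside a complete ε-greedy agent. The class name <code>EpsilonGreedyBandit</code>, the <code>update</code> method, the <code>counts</code> array, the <code>run</code> helper and the made-up reward means are assumptions for illustration only and may differ from the code in the repository.</p>

```python
import numpy as np


class EpsilonGreedyBandit:
    """Hypothetical epsilon-greedy agent; names are illustrative assumptions."""

    def __init__(self, arms, epsilon, seed=0):
        self.arms = arms
        self.epsilon = epsilon
        self.rng = np.random.default_rng(seed)
        self.rewards = np.zeros(arms)            # value estimate for each action
        self.counts = np.zeros(arms, dtype=int)  # how many times each action was taken

    def choose_action(self):
        rand = self.rng.uniform(0, 1)
        if rand > self.epsilon:
            # exploit: pick the action with the highest estimated value
            return int(np.argmax(self.rewards))
        # explore: pick any action uniformly at random (upper bound is exclusive)
        return int(self.rng.integers(0, self.arms))

    def update(self, action, reward):
        # incremental sample average of the rewards seen for this action
        self.counts[action] += 1
        self.rewards[action] += (reward - self.rewards[action]) / self.counts[action]


def run(steps=2000):
    true_means = np.array([0.1, 0.5, 0.9])  # made-up mean reward of each arm
    bandit = EpsilonGreedyBandit(arms=3, epsilon=0.1)
    for _ in range(steps):
        action = bandit.choose_action()
        reward = bandit.rng.normal(true_means[action], 1.0)
        bandit.update(action, reward)
    return bandit


if __name__ == "__main__":
    trained = run()
    print("estimated values:", trained.rewards)
    print("times chosen:", trained.counts)
```

<p>Note that the explore branch ignores both <code>rewards</code> and <code>counts</code>: every action is equally likely to be tried, no matter how promising or how rarely sampled it is.</p>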
<p>Let's use the formula below:</p>
<p><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>A</mi><mi>t</mi></msub><mo>≐</mo><mi><munder><mo><mi mathvariant="normal">arg max</mi><mo>⁡</mo></mo><mi>a</mi></munder></mi><mtext> </mtext><mo fence="false" stretchy="true" minsize="3em" maxsize="3em">[</mo><msub><mi>Q</mi><mi>t</mi></msub><mo stretchy="false">(</mo><mi>a</mi><mo stretchy="false">)</mo><mo>+</mo><mi>c</mi><msqrt><mfrac><mrow><mi>log</mi><mo>⁡</mo><mi>t</mi></mrow><mrow><msub><mi>N</mi><mi>t</mi></msub><mo stretchy="false">(</mo><mi>a</mi><mo stretchy="false">)</mo></mrow></mfrac></msqrt><mtext> </mtext><mo fence="false" stretchy="true" minsize="3em" maxsize="3em">]</mo></mrow><annotation encoding="application/x-tex">A_t \doteq \underset{a}{\argmax}\&gt; \Bigg[Q_t(a) + c \sqrt{\frac{\log{t}}{N_t(a)}}\,\Bigg]</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em"></span><span class="mord"><span class="mord mathnormal">A</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">≐</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:3em;vertical-align:-1.25em"></span><span class="mord"><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.4306em"><span 
style="top:-2.2056em;margin-left:0em"><span class="pstrut" style="height:3em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">a</span></span></span></span><span style="top:-3em"><span class="pstrut" style="height:3em"></span><span><span class="mop"><span class="mop"><span class="mord mathrm" style="margin-right:0.01389em">arg</span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord mathrm">max</span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.8944em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord"><span class="delimsizing size4">[</span></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">a</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span></span><span class="base"><span class="strut" style="height:3em;vertical-align:-1.25em"></span><span class="mord mathnormal">c</span><span class="mord sqrt"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.1911em"><span class="svg-align" style="top:-3.8em"><span class="pstrut" style="height:3.8em"></span><span class="mord" style="padding-left:1em"><span 
class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.9322em"><span style="top:-2.655em"><span class="pstrut" style="height:3em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.10903em">N</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2963em"><span style="top:-2.357em;margin-left:-0.109em;margin-right:0.0714em"><span class="pstrut" style="height:2.5em"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.143em"><span></span></span></span></span></span></span><span class="mopen mtight">(</span><span class="mord mathnormal mtight">a</span><span class="mclose mtight">)</span></span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:0.04em"></span></span><span style="top:-3.4461em"><span class="pstrut" style="height:3em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mop mtight"><span class="mtight">l</span><span class="mtight">o</span><span class="mtight" style="margin-right:0.01389em">g</span></span><span class="mspace mtight" style="margin-right:0.1952em"></span><span class="mord mtight"><span class="mord mathnormal mtight">t</span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.52em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span><span style="top:-3.1511em"><span class="pstrut" style="height:3.8em"></span><span class="hide-tail" 
style="min-width:1.02em;height:1.88em"><svg xmlns="http://www.w3.org/2000/svg" width="400em" height="1.88em" viewBox="0 0 400000 1944" preserveAspectRatio="xMinYMin slice"><path d="M983 90
l0 -0
c4,-6.7,10,-10,18,-10 H400000v40
H1013.1s-83.4,268,-264.1,840c-180.7,572,-277,876.3,-289,913c-4.7,4.7,-12.7,7,-24,7
s-12,0,-12,0c-1.3,-3.3,-3.7,-11.7,-7,-25c-35.3,-125.3,-106.7,-373.3,-214,-744
c-10,12,-21,25,-33,39s-32,39,-32,39c-6,-5.3,-15,-14,-27,-26s25,-30,25,-30
c26.7,-32.7,52,-63,76,-91s52,-60,52,-60s208,722,208,722
c56,-175.3,126.3,-397.3,211,-666c84.7,-268.7,153.8,-488.2,207.5,-658.5
c53.7,-170.3,84.5,-266.8,92.5,-289.5z
M1001 80h400000v40h-400000z"></path></svg></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.6489em"><span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord"><span class="delimsizing size4">]</span></span></span></span></span></p>
<p>Where <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>log</mi><mo>⁡</mo><mi>t</mi></mrow><annotation encoding="application/x-tex">\log{t}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em"></span><span class="mop">lo<span style="margin-right:0.01389em">g</span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord"><span class="mord mathnormal">t</span></span></span></span></span> is the natural logarithm (i.e. to the base of the <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>e</mi></mrow><annotation encoding="application/x-tex">e</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">e</span></span></span></span> constant), and <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>N</mi><mi>t</mi></msub><mo stretchy="false">(</mo><mi>a</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">N_t(a)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.10903em">N</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em"><span style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span 
class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">a</span><span class="mclose">)</span></span></span></span> is the number of times action <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>a</mi></mrow><annotation encoding="application/x-tex">a</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">a</span></span></span></span> has been selected so far.</p>
<p>The square-root term in the formula above measures the uncertainty in the estimate of the value of action <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>a</mi></mrow><annotation encoding="application/x-tex">a</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">a</span></span></span></span>.</p>
<p>The code looks like this:</p>
<div class="language-Python language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">choose_action</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    rand </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">uniform</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> rand </span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain"> self</span><span 
class="token punctuation" style="color:#393A34">.</span><span class="token plain">epsilon</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># exploit</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        ucb </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">c </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">sqrt</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">t </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">/</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span 
class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">action_count </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">argmax</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">ucb</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">else</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># explore</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> randint</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> self</span><span class="token punctuation" 
style="color:#393A34">.</span><span class="token plain">arms </span><span class="token operator" style="color:#393A34">-</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>In the constructor I added <code>c</code> and <code>t</code>:</p>
<div class="language-Python language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">__init__</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> arms</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> pulls</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> epsilon</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> c</span><span class="token operator" style="color:#393A34">=</span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">t </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">c </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> 
c</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p><code>c</code> is a parameter that controls the degree of exploration.</p>
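<p>To get a feeling for <code>c</code>, here is a minimal standalone sketch (not the code from the repository) that computes the same score as <code>choose_action</code> for a fixed set of estimates. The helper name <code>ucb_scores</code> and the sample numbers are assumptions made up purely for illustration:</p>

```python
import numpy as np


def ucb_scores(q_values, action_count, t, c):
    """Return Q_t(a) plus the exploration bonus for every action.

    Hypothetical helper; it mirrors the expression used in choose_action,
    including the +1 terms that avoid log(0) and division by zero.
    """
    q_values = np.asarray(q_values, dtype=float)
    action_count = np.asarray(action_count, dtype=float)
    bonus = c * np.sqrt(np.log(t + 1) / (action_count + 1))
    return q_values + bonus


# Made-up example: arm 0 looks best but has been pulled often,
# arm 2 looks worst but has barely been tried.
q = [0.5, 0.4, 0.3]
n = [50, 10, 1]

greedy_choice = int(np.argmax(ucb_scores(q, n, t=61, c=0.0)))
ucb_choice = int(np.argmax(ucb_scores(q, n, t=61, c=2.0)))
print(greedy_choice, ucb_choice)
```

<p>With <code>c = 0</code> the choice is purely greedy, while with a larger <code>c</code> the rarely tried arm wins because its bonus dominates the estimate.</p>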
<p>I generated a table with possible values for some action <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>a</mi></mrow><annotation encoding="application/x-tex">a</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">a</span></span></span></span>. Note that the table was computed with base-10 logarithms, so the absolute numbers differ from the natural-log formula by a constant factor, but the trend is the same:</p>
<table><thead><tr><th><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi></mrow><annotation encoding="application/x-tex">t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6151em"></span><span class="mord mathnormal">t</span></span></span></span></th><th><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>N</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">N_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.10903em">N</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em"><span style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span></span></th><th><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>log</mi><mo>⁡</mo><mi>t</mi></mrow><annotation encoding="application/x-tex">\log{t}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em"></span><span class="mop">lo<span style="margin-right:0.01389em">g</span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord"><span class="mord 
mathnormal">t</span></span></span></span></span></th><th><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>c</mi><msqrt><mfrac><mrow><mi>log</mi><mo>⁡</mo><mi>t</mi></mrow><msub><mi>N</mi><mi>t</mi></msub></mfrac></msqrt><mo separator="true">,</mo><mi>c</mi><mo>=</mo><mn>2</mn></mrow><annotation encoding="application/x-tex">c\sqrt{\frac{\log{t}}{N_t}}, c = 2</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.84em;vertical-align:-0.6114em"></span><span class="mord mathnormal">c</span><span class="mord sqrt"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.2286em"><span class="svg-align" style="top:-3.8em"><span class="pstrut" style="height:3.8em"></span><span class="mord" style="padding-left:1em"><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.9322em"><span style="top:-2.655em"><span class="pstrut" style="height:3em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.10903em">N</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2963em"><span style="top:-2.357em;margin-left:-0.109em;margin-right:0.0714em"><span class="pstrut" style="height:2.5em"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.143em"><span></span></span></span></span></span></span></span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:0.04em"></span></span><span 
style="top:-3.4461em"><span class="pstrut" style="height:3em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mop mtight"><span class="mtight">l</span><span class="mtight">o</span><span class="mtight" style="margin-right:0.01389em">g</span></span><span class="mspace mtight" style="margin-right:0.1952em"></span><span class="mord mtight"><span class="mord mathnormal mtight">t</span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.4451em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span><span style="top:-3.1886em"><span class="pstrut" style="height:3.8em"></span><span class="hide-tail" style="min-width:1.02em;height:1.88em"><svg xmlns="http://www.w3.org/2000/svg" width="400em" height="1.88em" viewBox="0 0 400000 1944" preserveAspectRatio="xMinYMin slice"><path d="M983 90
l0 -0
c4,-6.7,10,-10,18,-10 H400000v40
H1013.1s-83.4,268,-264.1,840c-180.7,572,-277,876.3,-289,913c-4.7,4.7,-12.7,7,-24,7
s-12,0,-12,0c-1.3,-3.3,-3.7,-11.7,-7,-25c-35.3,-125.3,-106.7,-373.3,-214,-744
c-10,12,-21,25,-33,39s-32,39,-32,39c-6,-5.3,-15,-14,-27,-26s25,-30,25,-30
c26.7,-32.7,52,-63,76,-91s52,-60,52,-60s208,722,208,722
c56,-175.3,126.3,-397.3,211,-666c84.7,-268.7,153.8,-488.2,207.5,-658.5
c53.7,-170.3,84.5,-266.8,92.5,-289.5z
M1001 80h400000v40h-400000z"></path></svg></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.6114em"><span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord mathnormal">c</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:0.6444em"></span><span class="mord">2</span></span></span></span></th></tr></thead><tbody><tr><td>1</td><td>1</td><td>0</td><td>0</td></tr><tr><td>2</td><td>1</td><td>0.3010299957</td><td>1.0973240099</td></tr><tr><td>3</td><td>1</td><td>0.4771212547</td><td>1.3814792864</td></tr><tr><td>4</td><td>1</td><td>0.6020599913</td><td>1.5518504971</td></tr><tr><td>5</td><td>1</td><td>0.6989700043</td><td>1.6720885196</td></tr><tr><td>6</td><td>2</td><td>0.7781512504</td><td>1.2475185372</td></tr><tr><td>7</td><td>2</td><td>0.84509804</td><td>1.3000754132</td></tr><tr><td>8</td><td>2</td><td>0.903089987</td><td>1.3439419534</td></tr><tr><td>9</td><td>2</td><td>0.9542425094</td><td>1.3814792864</td></tr><tr><td>10</td><td>2</td><td>1</td><td>1.4142135624</td></tr><tr><td>11</td><td>3</td><td>1.0413926852</td><td>1.1783563044</td></tr><tr><td>12</td><td>3</td><td>1.079181246</td><td>1.1995450505</td></tr><tr><td>13</td><td>3</td><td>1.1139433523</td><td>1.218711534</td></tr><tr><td>14</td><td>3</td><td>1.1461280357</td><td>1.2361920216</td></tr><tr><td>15</td><td>3</td><td>1.1760912591</td><td>1.2522466525</td></tr><tr><td>16</td><td>4</td><td>1.2041199827</td><td>1.0973240099</td></tr><tr><td>17</td><td>4</td><td>1.2304489214</td><td>1.1092560216</td></tr><tr><td>18</td><td>4</td><td>1.2552725051</td><td>1.1203894435</td></tr><tr><td>19</td><td>4</td><td>1.278753601</td><td>1.13081988</td></tr><tr><td>20</td><td>4</td><td>1.3010299957</td><td>1.1406270186</td></tr><tr><td>21</td><td>5</td><td>1.3222192947</td><td>1.0284821028</td></tr><tr><td>22</td><td>5</td><td>1.3424226808</td><td>1.036309869</td></tr><tr><td>23</td><td>5</td><td>1.361727836</td><td>1.0437347694</td></tr><tr><td>24</td><td>5</td><td>1.3802112417</td><td>1.0507944582</td></tr><tr><td>25</td><td>5</td><td>1.3979400087</td><td>1.0575216343</td></tr><tr><td>26</td><td>6</td><td>1.414973348</td><td>0.9712443386</td></tr><tr><td>27</td><td>6</td><td>1.4313637642</td><td>0.9768533715</td></tr><tr><td>28</td><td>6</td><td>1.4471580313</td><td>0.9822280901</td></tr><tr><td>29</td><td>6</td><td>1.4623979979</td><td>0.9873864485</td></tr><tr><td>30</td><td>6</td><td>1.4771212547</td><td>0.9923444478</td></tr><tr><td>31</td><td>7</td><td>1.4913616938</td><td>0.9231504115</td></tr><tr><td>32</td><td>7</td><td>1.5051499783</td><td>0.9274080558</td></tr><tr><td>33</td><td>7</td><td>1.5185139399</td><td>0.9315161036</td></tr><tr><td>34</td><td>7</td><td>1.531478917</td><td>0.9354842648</td></tr><tr><td>35</td><td>7</td><td>1.5440680444</td><td>0.939321349</td></tr><tr><td>36</td><td>8</td><td>1.5563025008</td><td>0.8821288173</td></tr><tr><td>37</td><td>8</td><td>1.5682017241</td><td>0.885494699</td></tr><tr><td>38</td><td>8</td><td>1.5797835966</td><td>0.8887585714</td></tr><tr><td>39</td><td>9</td><td>1.591064607</td><td>0.8409160632</td></tr><tr><td>40</td><td>10</td><td>1.6020599913</td><td>0.8005148322</td></tr><tr><td>41</td><td>11</td><td>1.6127838567</td><td>0.7658112411</td></tr><tr><td>42</td><td>12</td><td>1.6232492904</td><td>0.7355835077</td></tr><tr><td>43</td><td>13</td><td>1.6334684556</td><td>0.70894688</td></tr><tr><td>44</td><td>14</td><td>1.6434526765</td><td>0.6852429551</td></tr><tr><td>45</td><td>15</td><td>1.6532125138</td><td>0.6639703836</td></tr><tr><td>46</td><td>16</td><td>1.6627578317</td><td>0.6447398374</td></tr><tr><td>47</td><td>17</td><td>1.6720978579</td><td>0.6272438044</td></tr><tr><td>48</td><td>18</td><td>1.6812412374</td><td>0.6112357678</td></tr><tr><td>49</td><td>19</td><td>1.69019608</td><td>0.59651551</td></tr><tr><td>50</td><td>20</td><td>1.6989700043</td><td>0.5829185199</td></tr><tr><td>51</td><td>21</td><td>1.7075701761</td><td>0.5703082168</td></tr><tr><td>52</td><td>22</td><td>1.7160033436</td><td>0.5585701459</td></tr><tr><td>53</td><td>23</td><td>1.7242758696</td><td>0.5476075824</td></tr><tr><td>54</td><td>24</td><td>1.7323937598</td><td>0.5373381555</td></tr><tr><td>55</td><td>25</td><td>1.7403626895</td><td>0.5276912263</td></tr><tr><td>56</td><td>26</td><td>1.748188027</td><td>0.5186058273</td></tr><tr><td>57</td><td>27</td><td>1.7558748557</td><td>0.5100290269</td></tr><tr><td>58</td><td>28</td><td>1.7634279936</td><td>0.501914619</td></tr><tr><td>59</td><td>29</td><td>1.7708520116</td><td>0.4942220654</td></tr></tbody></table>
<p>As the table shows, the uncertainty term under the square root generally decreases as time <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi></mrow><annotation encoding="application/x-tex">t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6151em"></span><span class="mord mathnormal">t</span></span></span></span> passes, but whenever the action is not selected, the uncertainty increases slightly.</p>
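<p>The same pattern can be checked with a few lines of Python. This is only a sketch: the helper name <code>exploration_bonus</code> is hypothetical, and it uses the natural logarithm from the formula, so the numbers will not match the table digit for digit.</p>

```python
import math


def exploration_bonus(t, n, c=2.0):
    """c * sqrt(ln t / N_t(a)), using the natural logarithm as in the formula."""
    return c * math.sqrt(math.log(t) / n)


# Action not selected between t = 2 and t = 5: N stays at 1, the bonus grows.
not_selected = [exploration_bonus(t, 1) for t in range(2, 6)]

# Action selected at t = 6: N goes from 1 to 2 and the bonus drops.
before = exploration_bonus(5, 1)
after = exploration_bonus(6, 2)

print([round(b, 3) for b in not_selected])
print(round(before, 3), round(after, 3))
```

<p>The growth while the action is ignored is slow because the logarithm grows slowly, whereas each selection increases the denominator and shrinks the bonus noticeably.</p>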
<h2 class="anchor anchorWithStickyNavbar_iC02" id="summary">Summary<a class="hash-link" aria-label="Direct link to Summary" title="Direct link to Summary" href="https://dloranc.github.io/2017/05/31/multi-armed-bandit-upper-confidence-bound#summary">​</a></h2>
<p>This method of selecting actions based on uncertainty is known as UCB (upper confidence bound). It is a statistical method related to confidence intervals, or at least that's how I understand it. I barely remember anything about statistics; I was never good at it. UCB is quite a good method, but Sutton &amp; Barto warn that it does not work well on non-stationary problems or on problems with a large state space.</p>
<p>The code can be seen <a href="https://github.com/dloranc/reinforcement-learning-an-introduction/blob/master/01_multi_arm_bandits/05_ucb.py" target="_blank" rel="noopener noreferrer">here</a>.</p>
<p>Finally, some charts:</p>
<p><img decoding="async" loading="lazy" alt="average rewards" src="https://dloranc.github.io/assets/images/05_average_reward-06639bbed974901c070ae469ef69237b.png" width="960" height="400" class="img_r5VP"></p>
<p>Pretty good: on average, UCB earns higher average rewards than the version without it. The jump and drop at the very beginning of the run is interesting.</p>
<p>It's worse with optimal actions:
<img decoding="async" loading="lazy" alt="optimal actions" src="https://dloranc.github.io/assets/images/05_optimal_action-441d8848403b12694bb03c56968c7ad4.png" width="960" height="400" class="img_r5VP"></p>
<p>No wonder: exploration happens more often.</p>
<p>For one MAB run:
<img decoding="async" loading="lazy" alt="one MAB run" src="https://dloranc.github.io/assets/images/05_rewards-33e6a3291eb632fc5fc690ff0134efbc.png" width="960" height="800" class="img_r5VP"></p>]]></content>
        <category label="Sutton &amp; Barto" term="Sutton &amp; Barto"/>
        <category label="Python" term="Python"/>
        <category label="Multi Armed Bandit" term="Multi Armed Bandit"/>
        <category label="Reinforcement learning" term="Reinforcement learning"/>
        <category label="DSP 2017" term="DSP 2017"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Multi-armed bandit - optimistic initial values]]></title>
        <id>https://dloranc.github.io/2017/05/28/multi-armed-bandit-optimistic-initial-values</id>
        <link href="https://dloranc.github.io/2017/05/28/multi-armed-bandit-optimistic-initial-values"/>
        <updated>2017-05-28T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[We continue the topic of multi-armed bandit optimization. Another effective and very simple optimization.]]></summary>
        <content type="html"><![CDATA[<p><em>This post is part of my struggle with the book <a href="http://incompleteideas.net/sutton/book/the-book-2nd.html" target="_blank" rel="noopener noreferrer">"Reinforcement Learning: An Introduction"</a> by Richard S. Sutton and Andrew G. Barto. Other posts systematizing my knowledge and presenting the code I wrote can be found under the tag <a href="https://dloranc.github.io/tags/sutton-and-barto">Sutton &amp; Barto</a> and in the repository <a href="https://github.com/dloranc/reinforcement-learning-an-introduction" target="_blank" rel="noopener noreferrer">dloranc/reinforcement-learning-an-introduction</a>.</em></p>
<hr>
<p>All the methods I have described so far depend on initial estimates of the value of <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>Q</mi><mn>1</mn></msub><mo stretchy="false">(</mo><mi>a</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">Q_1(a)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">a</span><span class="mclose">)</span></span></span></span>. This is especially visible when we calculate <strong>MAB</strong> with <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ϵ</mi><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">\epsilon = 0</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">ϵ</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:0.6444em"></span><span class="mord">0</span></span></span></span>, i.e. without exploration, still selecting the best possible action (arm). 
In statistics, we call such methods biased. The bias disappears for methods with <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>α</mi></mrow><annotation encoding="application/x-tex">\alpha</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span></span></span></span> of <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mn>1</mn><mi>n</mi></mfrac></mrow><annotation encoding="application/x-tex">\frac{1}{n}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.1901em;vertical-align:-0.345em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8451em"><span style="top:-2.655em"><span class="pstrut" style="height:3em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:0.04em"></span></span><span style="top:-3.394em"><span class="pstrut" style="height:3em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.345em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span> when each action is selected at least once. 
For constant <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>α</mi></mrow><annotation encoding="application/x-tex">\alpha</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span></span></span></span>, the bias does not disappear; it only decreases over time (with subsequent iterations of the algorithm).</p>
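<p>A small sketch may make the difference concrete. With the update <code>Q_new = Q_old + alpha * (R - Q_old)</code>, the weight that the initial estimate keeps after a series of updates is the product of <code>(1 - alpha)</code> over all of them. The helper name below is hypothetical, and the sketch assumes that the very first sample-average update uses a step size of 1/1:</p>

```python
def weight_of_initial_estimate(step_sizes):
    """Return the weight the initial estimate Q_1 still has after the given updates."""
    weight = 1.0
    for alpha in step_sizes:
        # Each update keeps a fraction (1 - alpha) of the previous estimate.
        weight *= 1.0 - alpha
    return weight


n = 10
# Sample-average method: step sizes 1/1, 1/2, ..., 1/n.
sample_average = weight_of_initial_estimate([1.0 / k for k in range(1, n + 1)])
# Constant step size (0.1 is an arbitrary value chosen for illustration).
constant = weight_of_initial_estimate([0.1] * n)

print(sample_average)  # 0.0, the first update already removes Q_1 completely
print(constant)        # roughly 0.349, i.e. 0.9 ** 10
```

<p>For the constant step size the weight shrinks geometrically but never reaches zero, which is exactly the remaining bias described above.</p>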
<p>Instead of fighting the bias, we can put it to use with so-called optimistic initial values. In the code of the examples I have written so far, the true values of each action came from the normal distribution (mean 0, variance 1) and were set in the constructor:</p>
<div class="language-Python language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">true_reward </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">randn</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> _ </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> </span><span class="token builtin">range</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">arms</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">]</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 
0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>By the way, I would like to remind you that getting the reward for a given action looks like this:</p>
<div class="language-Python language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">get_reward</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> action</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">true_reward</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">action</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">randn</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span 
class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>That's it for the reminder.</p>
<p>We can encourage the algorithm to explore at the beginning of its operation by setting very optimistic initial estimates, i.e. values much higher than the rewards we can realistically expect. I chose 5, and it looks like this:</p>
<div class="language-Python language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">action_count </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">ones</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">arms</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">ones</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">arms</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> </span><span class="token number" 
style="color:#36acaa">5</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>And that's it; no other modifications to the code are needed. What happens when we run the script with such <code>self.rewards</code> values? Let's look at an example with epsilon equal to zero:</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">5.</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token 
plain">reward</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1.87411213524</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">3.43705607</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token number" style="color:#36acaa">1.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">reward</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.504446069392</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token 
punctuation" style="color:#393A34">[</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">3.43705607</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">2.75222303</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token number" style="color:#36acaa">1.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">reward</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.374953664141</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">3.43705607</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">2.75222303</span><span class="token plain">  </span><span class="token number" 
style="color:#36acaa">2.68747683</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token number" style="color:#36acaa">1.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">reward</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">-</span><span class="token number" style="color:#36acaa">3.66371477543</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">3.43705607</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">2.75222303</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">2.68747683</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">0.66814261</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          
</span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token number" style="color:#36acaa">1.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">reward</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">-</span><span class="token number" style="color:#36acaa">1.59327852639</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">3.43705607</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">2.75222303</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">2.68747683</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">0.66814261</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">1.70336074</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span 
class="token plain">  </span><span class="token number" style="color:#36acaa">1.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">reward</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">-</span><span class="token number" style="color:#36acaa">2.41539982786</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">3.43705607</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">2.75222303</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">2.68747683</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">0.66814261</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">1.70336074</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">1.29230009</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token number" style="color:#36acaa">1.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" 
style="color:#36acaa">5.</span><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">reward</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">-</span><span class="token number" style="color:#36acaa">0.641184621377</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">3.43705607</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">2.75222303</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">2.68747683</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">0.66814261</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">1.70336074</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">1.29230009</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token number" style="color:#36acaa">2.17940769</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token 
plain">reward</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">-</span><span class="token number" style="color:#36acaa">1.53850026836</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">3.43705607</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">2.75222303</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">2.68747683</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">0.66814261</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">1.70336074</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">1.29230009</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token number" style="color:#36acaa">2.17940769</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">1.73074987</span><span class="token plain">  </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">          </span><span class="token number" style="color:#36acaa">5.</span><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">]</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 
19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
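<p>Every reward received is lower than five, and it is averaged into the estimate of the action just tried, so that estimate drops below the untouched fives. As a result, even with epsilon equal to zero, exploration occurs and the algorithm tries each action in turn. Let's see it graphically (epsilon equal to zero):</p>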
<p><img decoding="async" loading="lazy" alt="optimistic initial values, epsilon = 0" src="https://dloranc.github.io/assets/images/01_rewards-1e67c3006a681c0754f2427d258a4871.png" width="960" height="800" class="img_r5VP"></p>
<p>For comparison, without our optimization (epsilon is also zero):</p>
<p><img decoding="async" loading="lazy" alt="without optimistic initial values, epsilon = 0" src="https://dloranc.github.io/assets/images/02_rewards-e349e93db420144ff785fc230aa926d6.png" width="960" height="800" class="img_r5VP"></p>
<p>In the other charts, the jumps at the beginning are interesting. They are the result of our optimization:</p>
<p><img decoding="async" loading="lazy" alt="average rewards" src="https://dloranc.github.io/assets/images/04_average_reward-d3d3504f7694ea8101f3e1e0d14da576.png" width="960" height="400" class="img_r5VP">
<img decoding="async" loading="lazy" alt="optimal actions" src="https://dloranc.github.io/assets/images/04_optimal_action-74cd49f2dd02588788e70330dd06eac5.png" width="960" height="400" class="img_r5VP"></p>
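<p>The mechanism can also be reproduced in a few lines. The following is only a sketch, not the code from the repository: it uses plain Python instead of NumPy, the names <code>run_greedy</code> and <code>reward_fn</code> are hypothetical, and it assumes the incremental-average update with counts starting at one, which is consistent with the log above (for example (5 + 1.874) / 2 = 3.437).</p>

```python
import random


def run_greedy(reward_fn, arms=10, steps=20, initial_value=5.0):
    """Purely greedy bandit (epsilon = 0) with optimistic initial estimates."""
    estimates = [initial_value] * arms  # plays the role of self.rewards
    counts = [1] * arms                 # plays the role of self.action_count
    chosen = []
    for _ in range(steps):
        # Greedy choice; on ties max() returns the lowest index.
        action = max(range(arms), key=lambda a: estimates[a])
        reward = reward_fn(action)
        counts[action] += 1
        # Incremental average with step size 1 / counts[action].
        estimates[action] += (reward - estimates[action]) / counts[action]
        chosen.append(action)
    return chosen, estimates


# Hypothetical test bed mirroring the post: true values from N(0, 1)
# plus unit-variance noise on every pull.
rng = random.Random(0)
true_reward = [rng.gauss(0, 1) for _ in range(10)]
chosen, estimates = run_greedy(lambda a: true_reward[a] + rng.gauss(0, 1))
print(chosen[:10])
```

<p>As long as every reward is below the initial value, each arm that has been tried falls below the untried ones, so the first ten choices visit every arm exactly once. After that the run is ordinary greedy selection, which is why the effect is limited to the beginning.</p>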
<h2 class="anchor anchorWithStickyNavbar_iC02" id="summary">Summary<a class="hash-link" aria-label="Direct link to Summary" title="Direct link to Summary" href="https://dloranc.github.io/2017/05/28/multi-armed-bandit-optimistic-initial-values#summary">​</a></h2>
<p>As you can see, this is a simple but quite effective way to encourage <strong>MAB</strong> with an ε-greedy strategy to explore, even when it selects the best action every time (epsilon equal to zero). However, this method only works at the beginning of the algorithm, during the first dozen or so iterations. In the next post I will deal with another way to improve exploration, the so-called Upper-Confidence-Bound method.</p>]]></content>
        <category label="Sutton &amp; Barto" term="Sutton &amp; Barto"/>
        <category label="DSP 2017" term="DSP 2017"/>
        <category label="Python" term="Python"/>
        <category label="Multi Armed Bandit" term="Multi Armed Bandit"/>
        <category label="Reinforcement learning" term="Reinforcement learning"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Torchcraft - analysis and changing the game state]]></title>
        <id>https://dloranc.github.io/2017/05/28/torchcraft-analysis-and-changing-the-game-state</id>
        <link href="https://dloranc.github.io/2017/05/28/torchcraft-analysis-and-changing-the-game-state"/>
        <updated>2017-05-28T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[A post about how to change the game state by analyzing it and giving orders to units.]]></summary>
        <content type="html"><![CDATA[<p><em>This post is about the Starcraft bot I am developing using machine learning. The project is being developed as part of the "Daj Się Poznać 2017" competition.</em></p>
<hr>
<p>In the last <a href="https://dloranc.github.io/2017/05/21/torchcraft-basic-script">post about the project</a> I described what the maps look like and showed how to create a basic script that connects to Starcraft and fetches the game state in a loop, or rather the state of each subsequent logical game frame. I don't think I've explained yet what a logical game frame is. The point is that graphics rendering is independent of the calculations that change the game state. The rendering frame rate is not constant and depends on the speed of your computer, whereas the game state is recalculated at a fixed interval. If you have played Starcraft, you probably know that you can set the game speed in the options. Changing the speed changes the time between logical frame calculations. This is a fundamental difference: if we had a constant 30 or 60 FPS, the issue would probably be solved differently.</p>
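<p>To illustrate the idea only (this is not how Starcraft or TorchCraft is actually implemented), here is a minimal Python sketch. The function name, the timestamps and the intervals are all made up: rendering happens at irregular moments, while logical frames are computed at a fixed interval that stands in for the game speed setting.</p>

```python
def simulate(render_times_ms, logic_interval_ms):
    """Return (render time, logical frames computed so far) for every render call.

    render_times_ms: moments (in ms) at which a picture is drawn; irregular on purpose.
    logic_interval_ms: time between logical frames, i.e. the "game speed".
    """
    logical_frames = 0
    next_logic_time = logic_interval_ms
    history = []
    for now in render_times_ms:
        # Catch up on every logical frame that is due by this moment.
        while now >= next_logic_time:
            logical_frames += 1
            next_logic_time += logic_interval_ms
        history.append((now, logical_frames))
    return history


renders = [10, 35, 50, 120, 130]  # uneven rendering, as on a busy computer
print(simulate(renders, 40))  # [(10, 0), (35, 0), (50, 1), (120, 3), (130, 3)]
print(simulate(renders, 20))  # [(10, 0), (35, 1), (50, 2), (120, 6), (130, 6)]
```

<p>The render times are identical in both runs; only the interval changes, and with it the number of logical frames that a script reading the game state would see.</p>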
<h2 class="anchor anchorWithStickyNavbar_iC02" id="game-state">Game state<a class="hash-link" aria-label="Direct link to Game state" title="Direct link to Game state" href="https://dloranc.github.io/2017/05/28/torchcraft-analysis-and-changing-the-game-state#game-state">​</a></h2>
<p>OK, time to check what we have available in terms of game state. First, let me remind you of the basic script from the previous post:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> hostname </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"192.168.56.1"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> port </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">11111</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> tc </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> require </span><span class="token string" style="color:#e3116c">'torchcraft'</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">micro_battles </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span 
class="token keyword" style="color:#00009f">true</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">init</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">hostname</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> port</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> update </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">connect</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">port</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> setup </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    tc</span><span class="token punctuation" style="color:#393A34">.</span><span 
class="token function" style="color:#d73a49">command</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">set_speed</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">20</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">command</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">set_gui</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">command</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">set_cmd_optim</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" 
style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">send</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain">table</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">concat</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">setup</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">':'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">while</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">not</span><span class="token plain"> tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">state</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">game_ended </span><span class="token keyword" 
style="color:#00009f">do</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    update </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">receive</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic">-- code here</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">end</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">close</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div>
<p>In TorchCraft, the game state changes every logical frame. You can access it by referring to <code>tc.state</code>. This variable is a table with the following keys:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">--[[</span><br></span><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">    state will get its content updated from bwapi, it will have</span><br></span><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">    * map_data            : [torch.ByteTensor] 2D. 255 (-1) where not walkable</span><br></span><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">    * map_name            : [string] Name on the current map</span><br></span><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">    * img_mode            : [string] Image mode selected (can be empty, raw, compress)</span><br></span><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">    * lag_frames          : [int] number of frames from order to execution</span><br></span><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">    * frame_from_bwapi    : [int] game frame number as seen from BWAPI</span><br></span><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">    * game_ended          : [boolean] did the game end? (i.e. 
did the map end)</span><br></span><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">    * battle_just_ended   : [boolean] did the battle just end? (battle!=game)</span><br></span><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">    * waiting_for_restart : [boolean] are we waiting to restart a new battle?</span><br></span><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">    * battle_won          : [boolean] did we win the battle?</span><br></span><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">    * units_myself        : [table] w/ {unitIDs: unitStates} as {keys: values}</span><br></span><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">    * units_enemy         : [table] same as above, but for the enemy player</span><br></span><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">    * bullets             : [table] table with all bullets (position and type)</span><br></span><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">    * screen_position     : [table] Position of screen {x, y} in pixels. 
{0, 0} is top-left</span><br></span><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">]]</span><br></span></code></pre></div></div>
<p>That's not all: there is also <code>units_neutral</code> (it contains animals, minerals and gas) and probably a few others, but I would have to look for them in the TorchCraft code. The comment above comes from TorchCraft's source code.</p>
<p>As you can see, the basic script already uses <code>game_ended</code>.</p>
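<p>The other boolean flags can be read in the same way. Below is a small sketch of how they might be combined into a readable label. The helper <code>describe_state</code> and the mock table are hypothetical; in the real script you would pass <code>tc.state</code> instead, and the order in which the flags are checked is only my assumption.</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W">-- Hypothetical helper: turns the boolean flags of a state table
-- into a short description.
local function describe_state(state)
    if state.game_ended then
        return "game ended"
    elseif state.battle_just_ended then
        return state.battle_won and "battle won" or "battle lost"
    elseif state.waiting_for_restart then
        return "waiting for restart"
    end
    return "battle in progress"
end

-- Mock table standing in for tc.state (the values are made up)
local mock_state = {
    game_ended = false,
    battle_just_ended = true,
    waiting_for_restart = false,
    battle_won = true,
}
print(describe_state(mock_state))
</code></pre></div></div>
<p>For the mock table above this prints <code>battle won</code>.</p>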
<p>An interesting structure is <code>map_data</code>. It is a <code>ByteTensor</code> from Torch and contains information about the map, such as the places that units cannot walk on. Very useful. For <em>m5v5_c_far.scm</em> the map size is 256x256, which you can check with:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> map </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">state</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">map_data</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token function" style="color:#d73a49">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">map</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">size</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div>
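<p>According to the comment quoted earlier, <code>map_data</code> marks cells that are not walkable with 255. As a rough sketch of how that could be used, the snippet below counts the walkable cells. To keep it runnable without Torch or Starcraft, it works on a plain nested Lua table that only stands in for the tensor; <code>count_walkable</code> and the tiny 3x3 map are made up, and a real <code>ByteTensor</code> would have to be indexed differently.</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W">-- Value used for cells that units cannot walk on
local NOT_WALKABLE = 255

-- Counts the cells of a 2D grid that are not marked as unwalkable.
-- The grid is a plain nested table, a stand-in for tc.state.map_data.
local function count_walkable(grid)
    local count = 0
    for _, row in ipairs(grid) do
        for _, cell in ipairs(row) do
            if cell ~= NOT_WALKABLE then
                count = count + 1
            end
        end
    end
    return count
end

-- Made-up 3x3 map; the non-255 values are arbitrary
local mock_map = {
    {0, 0, 255},
    {0, 255, 255},
    {1, 0, 0},
}
print(count_walkable(mock_map))
</code></pre></div></div>
<p>For this mock map the result is 6. A count like this could later serve as a quick sanity check that the map was loaded correctly.</p>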
<p>For us, the most interesting ones will be <code>units_myself</code>, <code>units_enemy</code> and <code>bullets</code>. Let's check what <code>units_myself</code> contains:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token function" style="color:#d73a49">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">state</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">units_myself</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token number" style="color:#36acaa">21</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      lifted </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">false</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token 
plain">      pixel_size_x </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">17</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      detected </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">true</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      gwcd </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      idle </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">false</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      awrange </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">16</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      order </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">6</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      type </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span 
class="token-line" style="color:#393A34"><span class="token plain">      position </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">          </span><span class="token number" style="color:#36acaa">1</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">83</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">          </span><span class="token number" style="color:#36acaa">2</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">141</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      targetpos </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">          </span><span class="token number" style="color:#36acaa">1</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">:</span><span class="token 
plain"> </span><span class="token number" style="color:#36acaa">60</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">          </span><span class="token number" style="color:#36acaa">2</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">150</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      energy </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      size </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      resource </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      gwdmgtype </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">3</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      pixel_y </span><span class="token punctuation" 
style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1128</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      shieldArmor </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      awattack </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">6</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      playerId </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      visible </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      velocity </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">          </span><span class="token number" style="color:#36acaa">1</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">:</span><span class="token 
plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">          </span><span class="token number" style="color:#36acaa">2</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      hp </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">40</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      awdmgtype </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">3</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      orders </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">          </span><span class="token number" style="color:#36acaa">1</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">              first_frame </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">5</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">              target </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">-</span><span class="token number" style="color:#36acaa">1</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">              type </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">6</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">              targetpos </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                  </span><span class="token number" style="color:#36acaa">1</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">60</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token 
plain">                  </span><span class="token number" style="color:#36acaa">2</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">150</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      max_hp </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">40</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      target </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">-</span><span class="token number" style="color:#36acaa">1</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      armor </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      max_shield </span><span class="token punctuation" 
style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      maxcd </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">15</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      gwattack </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">6</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      shield </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      awcd </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      pixel_x </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">664</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      gwrange </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">16</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      
pixel_size_y </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">20</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic">-- ...</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>That is a lot of data about a single unit.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="commands">Commands<a class="hash-link" aria-label="Direct link to Commands" title="Direct link to Commands" href="https://dloranc.github.io/2017/05/28/torchcraft-analysis-and-changing-the-game-state#commands">​</a></h2>
<p>It's time to give orders to units. Let's work through the while loop:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> give_orders </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">false</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">while</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">not</span><span class="token plain"> tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">state</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">game_ended </span><span class="token keyword" style="color:#00009f">do</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    update </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">receive</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" 
style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> actions </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> give_orders </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">false</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">then</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> uid</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> unit </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">pairs</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">state</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">units_myself</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" 
style="color:#00009f">do</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            table</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">insert</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">actions</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">command</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">command_unit_protected</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    uid</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">cmd</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">Attack_Move</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token operator" style="color:#393A34">-</span><span class="token number" 
style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token number" style="color:#36acaa">103</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token number" style="color:#36acaa">141</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">end</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        give_orders </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">true</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">end</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    
tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">send</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain">table</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">concat</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">actions</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">':'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">state</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">battle_just_ended </span><span class="token keyword" style="color:#00009f">or</span><span class="token plain"> tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">state</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">waiting_for_restart </span><span class="token keyword" style="color:#00009f">then</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        give_orders </span><span class="token operator" style="color:#393A34">=</span><span 
class="token plain"> </span><span class="token keyword" style="color:#00009f">false</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">end</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">end</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>At the very beginning of each battle, the above code orders all of our units to attack-move to the place where the enemy units are located. To issue commands in a given frame, we create an <code>actions</code> table:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> actions </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token punctuation" style="color:#393A34">}</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Then we iterate over all our units:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> uid</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> unit </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">pairs</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">state</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">units_myself</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">do</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">	</span><span class="token comment" style="color:#999988;font-style:italic">-- ...</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">end</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 
2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>On each iteration, <code>pairs</code> yields a key and its value; here they are assigned to the variables <code>uid</code> and <code>unit</code>.</p>
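<p>As a minimal sketch of how <code>pairs</code> behaves, the snippet below iterates over a small hand-made table. The <code>units</code> table and its <code>hp</code> field are hypothetical stand-ins for <code>tc.state.units_myself</code>, so it runs in plain Lua without TorchCraft:</p>
<pre class="prism-code language-lua"><code>-- Hypothetical stand-in for tc.state.units_myself: maps unit id to unit data.
local units = {
    [7] = { hp = 40 },
    [12] = { hp = 35 },
}

local count, total_hp = 0, 0

for uid, unit in pairs(units) do
    -- uid is the key (7 or 12), unit is the table stored under that key.
    print(uid, unit.hp)
    count = count + 1
    total_hp = total_hp + unit.hp
end
</code></pre>
<p>Keep in mind that <code>pairs</code> does not guarantee any particular iteration order, so the two units may be printed in either order.</p>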
<p>We create commands by calling <code>tc.command</code> with the specified arguments:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">command</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">command_unit_protected</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    uid</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">cmd</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">Attack_Move</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token operator" style="color:#393A34">-</span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token number" 
style="color:#36acaa">103</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token number" style="color:#36acaa">141</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>I have no idea what the first argument does. The second is the unit id (each one is unique) and the third is the command. I also don't know what the fourth argument is for. The fifth and sixth are the x and y positions. We add each command to the <code>actions</code> table with <code>table.insert</code>.</p>
<p>Time to send commands to Starcraft:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">send</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain">table</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">concat</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">actions</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">':'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The above code joins the commands in the table into a single string, separated by colons, and sends it.</p>
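<p>To get a feel for what the joined string looks like, here is a small sketch in plain Lua. The command strings are made-up placeholders, not the real output of <code>tc.command</code>:</p>
<pre class="prism-code language-lua"><code>local actions = {}

-- Placeholder strings standing in for the values returned by tc.command.
table.insert(actions, "cmd_a")
table.insert(actions, "cmd_b")
table.insert(actions, "cmd_c")

-- Join all entries into one string, separated by colons.
local message = table.concat(actions, ':')
print(message) -- prints cmd_a:cmd_b:cmd_c

-- An empty table is joined into an empty string.
local empty_message = table.concat({}, ':')
print(empty_message == "") -- prints true
</code></pre>
<p>The second case matters for the loop: on frames where no orders are added, <code>actions</code> stays empty and an empty string is what gets sent.</p>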
<p>The main loop also uses the <code>give_orders</code> variable. I added it to avoid spamming commands on every frame. Such spam makes Starcraft run slowly, and the units walk straight to the given location instead of attacking whatever they meet on the way.</p>
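<p>The effect of the flag can be sketched without TorchCraft. In the simulation below, the <code>frames</code> list is a hypothetical stand-in for the game state received on each frame; the point is only that orders go out once per battle rather than on every frame:</p>
<pre class="prism-code language-lua"><code>-- Hypothetical sequence of frames; the third one ends the first battle.
local frames = {
    { battle_just_ended = false },
    { battle_just_ended = false },
    { battle_just_ended = true },
    { battle_just_ended = false },
    { battle_just_ended = false },
}

local give_orders = false
local orders_issued = 0

for _, state in ipairs(frames) do
    if not give_orders then
        -- In the real loop the commands would be built and sent here.
        orders_issued = orders_issued + 1
        give_orders = true
    end

    if state.battle_just_ended then
        -- Allow new orders at the start of the next battle.
        give_orders = false
    end
end

print(orders_issued) -- prints 2
</code></pre>
<p>Five frames pass, but orders are issued only twice: once at the start of each battle.</p>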
<h2 class="anchor anchorWithStickyNavbar_iC02" id="summary">Summary<a class="hash-link" aria-label="Direct link to Summary" title="Direct link to Summary" href="https://dloranc.github.io/2017/05/28/torchcraft-analysis-and-changing-the-game-state#summary">​</a></h2>
<p>It's not difficult; it just took me some time to understand why the movement order didn't work. It turned out that you need to add this magical <code>-1</code> before the x and y positions. You can see the presented code in action in the GIF below.</p>
<p><img decoding="async" loading="lazy" alt="TorchCraft - simple battle" src="https://dloranc.github.io/assets/images/torchcraft_battle-6f0198303553912a1788c65d444af5ec.gif" width="700" height="329" class="img_r5VP"></p>
<p>In the next post I will try to either describe more things or finally use machine learning (Q-learning). We'll see if I have time.</p>]]></content>
        <category label="Projects" term="Projects"/>
        <category label="DSP 2017" term="DSP 2017"/>
        <category label="Lua" term="Lua"/>
        <category label="Starcraft" term="Starcraft"/>
        <category label="torch" term="torch"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Multi-armed bandit - non-stationary version]]></title>
        <id>https://dloranc.github.io/2017/05/21/multi-armed-bandit-non-stationary-version</id>
        <link href="https://dloranc.github.io/2017/05/21/multi-armed-bandit-non-stationary-version"/>
        <updated>2017-05-21T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Continuation of the multi-armed bandit theme. This time about how to deal with non-stationary versions of this problem.]]></summary>
        <content type="html"><![CDATA[<p><em>This post is part of my struggle with the book <a href="http://incompleteideas.net/sutton/book/the-book-2nd.html" target="_blank" rel="noopener noreferrer">"Reinforcement Learning: An Introduction"</a> by Richard S. Sutton and Andrew G. Barto. Other posts systematizing my knowledge and presenting the code I wrote can be found under the tag <a href="https://dloranc.github.io/tags/sutton-and-barto">Sutton &amp; Barto</a> and in the repository <a href="https://github.com/dloranc/reinforcement-learning-an-introduction" target="_blank" rel="noopener noreferrer">dloranc/reinforcement-learning-an-introduction</a>.</em></p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="non-stationary-problem">Non-stationary problem<a class="hash-link" aria-label="Direct link to Non-stationary problem" title="Direct link to Non-stationary problem" href="https://dloranc.github.io/2017/05/21/multi-armed-bandit-non-stationary-version#non-stationary-problem">​</a></h2>
<p>In this post, I will discuss a particular type of <strong>multi-armed bandit</strong> (MAB) problem in which the rewards of each one-armed bandit change over time. This is the so-called non-stationary version of <strong>MAB</strong>. Until now, rewards were drawn from a normal distribution with a fixed mean and variance (the mean for each arm was chosen randomly at the beginning, in the constructor).</p>
<p>It looked something like this:</p>
<p><img decoding="async" loading="lazy" alt="Stationary process" src="https://dloranc.github.io/assets/images/process_stationary-e62bdcf0ace8f43a357abee636b3eb9e.png" width="950" height="400" class="img_r5VP"></p>
<p>This is the so-called stationary process.</p>
<p>A non-stationary process, which is what we will deal with, looks like this:</p>
<p><img decoding="async" loading="lazy" alt="Non-stationary process - random walk" src="https://dloranc.github.io/assets/images/process_nonstationary-10969c5146ddbd342d1050687ee76d47.png" width="950" height="400" class="img_r5VP"></p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="solution">Solution<a class="hash-link" aria-label="Direct link to Solution" title="Direct link to Solution" href="https://dloranc.github.io/2017/05/21/multi-armed-bandit-non-stationary-version#solution">​</a></h2>
<p>Hmm, since the rewards we receive change over time, we'll need something like a weighted average that gives more weight to recent rewards.</p>
<p>First, let's recall the formula from the previous <a href="https://dloranc.github.io/2017/05/01/multi-armed-bandit-simple-optimization">entry</a>:</p>
<p><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>Q</mi><mrow><mi>n</mi><mo>+</mo><mn>1</mn></mrow></msub><mo>≐</mo><msub><mi>Q</mi><mi>n</mi></msub><mo>+</mo><mfrac><mn>1</mn><mi>n</mi></mfrac><mo fence="false" stretchy="true" minsize="1.8em" maxsize="1.8em">[</mo><msub><mi>R</mi><mi>n</mi></msub><mo>−</mo><msub><mi>Q</mi><mi>n</mi></msub><mo fence="false" stretchy="true" minsize="1.8em" maxsize="1.8em">]</mo></mrow><annotation encoding="application/x-tex">Q_{n+1} \doteq Q_n + \frac{1}{n}\Big[R_n - Q_n\Big]</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8917em;vertical-align:-0.2083em"></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span><span class="mbin mtight">+</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">≐</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:0.8778em;vertical-align:-0.1944em"></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 
mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span></span><span class="base"><span class="strut" style="height:1.8em;vertical-align:-0.65em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8451em"><span style="top:-2.655em"><span class="pstrut" style="height:3em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:0.04em"></span></span><span style="top:-3.394em"><span class="pstrut" style="height:3em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.345em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mord"><span class="delimsizing size2">[</span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" 
style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span></span><span class="base"><span class="strut" style="height:1.8em;vertical-align:-0.65em"></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mord"><span class="delimsizing size2">]</span></span></span></span></span></p>
<p>Let's transform it to:</p>
<p><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>Q</mi><mrow><mi>n</mi><mo>+</mo><mn>1</mn></mrow></msub><mo>≐</mo><msub><mi>Q</mi><mi>n</mi></msub><mo>+</mo><mi>α</mi><mo fence="false" stretchy="true" minsize="1.8em" maxsize="1.8em">[</mo><msub><mi>R</mi><mi>n</mi></msub><mo>−</mo><msub><mi>Q</mi><mi>n</mi></msub><mo fence="false" stretchy="true" minsize="1.8em" maxsize="1.8em">]</mo></mrow><annotation encoding="application/x-tex">Q_{n+1} \doteq Q_n + \alpha\Big[R_n - Q_n\Big]</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8917em;vertical-align:-0.2083em"></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span><span class="mbin mtight">+</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">≐</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:0.8778em;vertical-align:-0.1944em"></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal 
mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span></span><span class="base"><span class="strut" style="height:1.8em;vertical-align:-0.65em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mord"><span class="delimsizing size2">[</span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span></span><span class="base"><span class="strut" style="height:1.8em;vertical-align:-0.65em"></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mord"><span class="delimsizing 
size2">]</span></span></span></span></span></p>
<p>Here <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>α</mi></mrow><annotation encoding="application/x-tex">\alpha</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span></span></span></span> is a step-size parameter: it can be a constant or, as before, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mn>1</mn><mi>n</mi></mfrac></mrow><annotation encoding="application/x-tex">\frac{1}{n}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.1901em;vertical-align:-0.345em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8451em"><span style="top:-2.655em"><span class="pstrut" style="height:3em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:0.04em"></span></span><span style="top:-3.394em"><span class="pstrut" style="height:3em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.345em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span>.</p>
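<p>To make the difference between the two choices concrete, here is a minimal sketch of the update rule in Python. The function name <code>update_estimate</code>, the reward values and the constant step size of 0.5 are made up for illustration; they are not taken from the experiments in this post.</p>

```python
def update_estimate(q, reward, alpha):
    """One step of Q_{n+1} = Q_n + alpha * (R_n - Q_n)."""
    return q + alpha * (reward - q)


rewards = [1.0, 0.0, 1.0, 1.0]  # made-up rewards, for illustration only

# Step size 1/n: the estimate is the plain average of the rewards seen so far.
q_average = 0.0
for n, reward in enumerate(rewards, start=1):
    q_average = update_estimate(q_average, reward, 1.0 / n)

# Constant step size (0.5 is an arbitrary choice): recent rewards count more.
q_constant = 0.0
for reward in rewards:
    q_constant = update_estimate(q_constant, reward, 0.5)

print(q_average, q_constant)
```

<p>With the step size 1/n the result matches the arithmetic mean of the rewards, while the constant step size ends up closer to the most recent rewards.</p>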
<p>Unrolling the recursion step by step, we get:</p>
<span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable rowspacing="0.25em" columnalign="right left" columnspacing="0em"><mtr><mtd class="mtr-glue"></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><msub><mi>Q</mi><mrow><mi>n</mi><mo>+</mo><mn>1</mn></mrow></msub></mstyle></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mrow></mrow><mo>≐</mo><msub><mi>Q</mi><mi>n</mi></msub><mo>+</mo><mi>α</mi><mo fence="false" stretchy="true" minsize="1.8em" maxsize="1.8em">[</mo><msub><mi>R</mi><mi>n</mi></msub><mo>−</mo><msub><mi>Q</mi><mi>n</mi></msub><mo fence="false" stretchy="true" minsize="1.8em" maxsize="1.8em">]</mo></mrow></mstyle></mtd><mtd class="mtr-glue"></mtd><mtd class="mml-eqn-num"></mtd></mtr><mtr><mtd class="mtr-glue"></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow></mrow></mstyle></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mrow></mrow><mo>=</mo><mi>α</mi><msub><mi>R</mi><mi>n</mi></msub><mo>+</mo><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><mi>α</mi><mo stretchy="false">)</mo><msub><mi>Q</mi><mi>n</mi></msub></mrow></mstyle></mtd><mtd class="mtr-glue"></mtd><mtd class="mml-eqn-num"></mtd></mtr><mtr><mtd class="mtr-glue"></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow></mrow></mstyle></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mrow></mrow><mo>=</mo><mi>α</mi><msub><mi>R</mi><mi>n</mi></msub><mo>+</mo><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><mi>α</mi><mo stretchy="false">)</mo><mo stretchy="false">[</mo><mi>α</mi><msub><mi>R</mi><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow></msub><mo>+</mo><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><mi>α</mi><mo stretchy="false">)</mo><msub><mi>Q</mi><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow></msub><mo stretchy="false">]</mo></mrow></mstyle></mtd><mtd class="mtr-glue"></mtd><mtd class="mml-eqn-num"></mtd></mtr><mtr><mtd 
class="mtr-glue"></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow></mrow></mstyle></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mrow></mrow><mo>=</mo><mi>α</mi><msub><mi>R</mi><mi>n</mi></msub><mo>+</mo><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><mi>α</mi><mo stretchy="false">)</mo><mi>α</mi><msub><mi>R</mi><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow></msub><mo>+</mo><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><mi>α</mi><msup><mo stretchy="false">)</mo><mn>2</mn></msup><msub><mi>Q</mi><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow></msub></mrow></mstyle></mtd><mtd class="mtr-glue"></mtd><mtd class="mml-eqn-num"></mtd></mtr><mtr><mtd class="mtr-glue"></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow></mrow></mstyle></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mrow></mrow><mo>=</mo><mi>α</mi><msub><mi>R</mi><mi>n</mi></msub><mo>+</mo><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><mi>α</mi><mo stretchy="false">)</mo><mi>α</mi><msub><mi>R</mi><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow></msub><mo>+</mo><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><mi>α</mi><msup><mo stretchy="false">)</mo><mn>2</mn></msup><mi>α</mi><msub><mi>R</mi><mrow><mi>n</mi><mo>−</mo><mn>2</mn></mrow></msub><mo>+</mo></mrow></mstyle></mtd><mtd class="mtr-glue"></mtd><mtd class="mml-eqn-num"></mtd></mtr><mtr><mtd class="mtr-glue"></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow></mrow></mstyle></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mrow></mrow><mspace width="2em"></mspace><mspace width="2em"></mspace><mo>⋯</mo><mo>+</mo><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><mi>α</mi><msup><mo stretchy="false">)</mo><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow></msup><mi>α</mi><msub><mi>R</mi><mn>1</mn></msub><mo>+</mo><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><mi>α</mi><msup><mo stretchy="false">)</mo><mi>n</mi></msup><msub><mi>Q</mi><mn>1</mn></msub></mrow></mstyle></mtd><mtd class="mtr-glue"></mtd><mtd 
class="mml-eqn-num"></mtd></mtr><mtr><mtd class="mtr-glue"></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow></mrow></mstyle></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mrow></mrow><mo>=</mo><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><mi>α</mi><msup><mo stretchy="false">)</mo><mi>n</mi></msup><msub><mi>Q</mi><mn>1</mn></msub><mo>+</mo><munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>n</mi></munderover><mi>α</mi><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><mi>α</mi><msup><mo stretchy="false">)</mo><mrow><mi>n</mi><mo>−</mo><mi>i</mi></mrow></msup><msub><mi>R</mi><mi>i</mi></msub></mrow></mstyle></mtd><mtd class="mtr-glue"></mtd><mtd class="mml-eqn-num"></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{align} Q_{n+1} &amp; \doteq Q_n + \alpha\Big[R_n - Q_n\Big] \\
&amp; = \alpha R_n + (1 - \alpha)Q_n \\
&amp; = \alpha R_n + (1 - \alpha)[\alpha R_{n - 1} + (1 - \alpha) Q_{n - 1}] \\
&amp; = \alpha R_n + (1 - \alpha)\alpha R_{n - 1} + (1 - \alpha)^2 Q_{n - 1} \\
&amp; = \alpha R_n + (1 - \alpha)\alpha R_{n - 1} + (1 - \alpha)^2 \alpha R_{n-2} + \\
&amp; \qquad \qquad \dots + (1 - \alpha)^{n - 1} \alpha R_1 + (1 - \alpha)^n Q_1 \\
&amp; = (1 - \alpha)^n Q_1 + \sum_{i = 1}^n \alpha (1 - \alpha)^{n - i} R_i
\end{align}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:12.9014em;vertical-align:-6.2007em"></span><span class="mtable"><span class="col-align-r"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:6.7007em"><span style="top:-9.2021em"><span class="pstrut" style="height:3.6514em"></span><span class="mord"><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span><span class="mbin mtight">+</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em"><span></span></span></span></span></span></span></span></span><span style="top:-7.4121em"><span class="pstrut" style="height:3.6514em"></span><span class="mord"></span></span><span style="top:-5.9121em"><span class="pstrut" style="height:3.6514em"></span><span class="mord"></span></span><span style="top:-4.388em"><span class="pstrut" style="height:3.6514em"></span><span class="mord"></span></span><span style="top:-2.8639em"><span class="pstrut" style="height:3.6514em"></span><span class="mord"></span></span><span style="top:-1.3398em"><span class="pstrut" style="height:3.6514em"></span><span class="mord"></span></span><span style="top:0.9716em"><span class="pstrut" style="height:3.6514em"></span><span class="mord"></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:6.2007em"><span></span></span></span></span></span><span class="col-align-l"><span class="vlist-t vlist-t2"><span 
class="vlist-r"><span class="vlist" style="height:6.7007em"><span style="top:-9.2021em"><span class="pstrut" style="height:3.6514em"></span><span class="mord"><span class="mord"></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">≐</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mord"><span class="delimsizing size2">[</span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span 
class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mord"><span class="delimsizing size2">]</span></span></span></span><span style="top:-7.4121em"><span class="pstrut" style="height:3.6514em"></span><span class="mord"><span class="mord"></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mclose">)</span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span 
class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span><span style="top:-5.9121em"><span class="pstrut" style="height:3.6514em"></span><span class="mord"><span class="mord"></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mclose">)</span><span class="mopen">[</span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span 
class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mclose">)</span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em"><span></span></span></span></span></span></span><span class="mclose">]</span></span></span><span style="top:-4.388em"><span class="pstrut" style="height:3.6514em"></span><span class="mord"><span class="mord"></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" 
style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mclose">)</span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span 
class="vlist-r"><span class="vlist" style="height:0.8641em"><span style="top:-3.113em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em"><span></span></span></span></span></span></span></span></span><span style="top:-2.8639em"><span class="pstrut" style="height:3.6514em"></span><span class="mord"><span class="mord"></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mclose">)</span><span class="mord mathnormal" 
style="margin-right:0.0037em">α</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8641em"><span style="top:-3.113em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span 
class="mord mathnormal mtight">n</span><span class="mbin mtight">−</span><span class="mord mtight">2</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em"><span></span></span></span></span></span></span><span class="mord">+</span></span></span><span style="top:-1.3398em"><span class="pstrut" style="height:3.6514em"></span><span class="mord"><span class="mord"></span><span class="mspace" style="margin-right:2em"></span><span class="mspace" style="margin-right:2em"></span><span class="mspace" style="margin-right:0.1667em"></span><span class="minner">⋯</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8641em"><span style="top:-3.113em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span></span></span></span></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span 
class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.7144em"><span style="top:-3.113em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span><span style="top:0.9716em"><span class="pstrut" style="height:3.6514em"></span><span class="mord"><span class="mord"></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" 
style="margin-right:0.2222em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.7144em"><span style="top:-3.113em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.6514em"><span style="top:-1.8723em;margin-left:0em"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="mrel mtight">=</span><span class="mord mtight">1</span></span></span></span><span style="top:-3.05em"><span class="pstrut" style="height:3.05em"></span><span><span class="mop op-symbol large-op">∑</span></span></span><span style="top:-4.3em;margin-left:0em"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.2777em"><span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span><span class="mclose"><span 
class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8747em"><span style="top:-3.113em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span><span class="mbin mtight">−</span><span class="mord mathnormal mtight">i</span></span></span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:6.2007em"><span></span></span></span></span></span></span></span><span class="tag"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:6.7007em"><span style="top:-9.2021em"><span class="pstrut" style="height:3.6514em"></span><span class="eqn-num"></span></span><span style="top:-7.4121em"><span class="pstrut" style="height:3.6514em"></span><span class="eqn-num"></span></span><span style="top:-5.9121em"><span class="pstrut" style="height:3.6514em"></span><span class="eqn-num"></span></span><span style="top:-4.388em"><span class="pstrut" style="height:3.6514em"></span><span class="eqn-num"></span></span><span style="top:-2.8639em"><span class="pstrut" style="height:3.6514em"></span><span class="eqn-num"></span></span><span style="top:-1.3398em"><span class="pstrut" 
style="height:3.6514em"></span><span class="eqn-num"></span></span><span style="top:0.9716em"><span class="pstrut" style="height:3.6514em"></span><span class="eqn-num"></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:6.2007em"><span></span></span></span></span></span></span></span></span>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="summary">Summary<a class="hash-link" aria-label="Direct link to Summary" title="Direct link to Summary" href="https://dloranc.github.io/2017/05/21/multi-armed-bandit-non-stationary-version#summary">​</a></h2>
<p>Let's check if it works:</p>
<p><img decoding="async" loading="lazy" alt="Multi-armed bandit - non-stationary version" src="https://dloranc.github.io/assets/images/rewards-92581fc54750daf2bdfa4b91b9754d7d.png" width="960" height="800" class="img_r5VP"></p>
<p>It works. That's all, and it wasn't much: once again we only had to rework one equation. I will just add that Sutton and Barto advise using a constant value of <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>α</mi></mrow><annotation encoding="application/x-tex">\alpha</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal" style="margin-right:0.0037em">α</span></span></span></span> in non-stationary problems. A constant step size does not meet the conditions that guarantee convergence of the estimates, and that is exactly what we want here: the estimates never fully settle and keep responding to the most recent rewards. More details can be found in <a href="http://incompleteideas.net/sutton/book/the-book.html" target="_blank" rel="noopener noreferrer">the book</a>.</p>
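<p>To make the role of a constant step size concrete, here is a minimal Python sketch. It is not the code from the linked repository, and the names <code>constant_step_update</code> and <code>reward_weights</code> are made up for illustration. It applies the update <code>Q ← Q + α(R − Q)</code> repeatedly and checks that the result matches the weighted sum derived above, in which the initial estimate gets the weight <code>(1 − α)^n</code> and the i-th reward gets the weight <code>α(1 − α)^(n − i)</code>.</p>

```python
import math


def constant_step_update(q, reward, alpha):
    """One incremental update: Q <- Q + alpha * (R - Q)."""
    return q + alpha * (reward - q)


def reward_weights(n, alpha):
    """Weights of Q_1 and of R_1..R_n in the closed-form expression."""
    initial_weight = (1 - alpha) ** n
    weights = [alpha * (1 - alpha) ** (n - i) for i in range(1, n + 1)]
    return initial_weight, weights


alpha = 0.1
q1 = 0.0
rewards = [1.0, 0.0, 2.0, 1.5, 3.0]  # arbitrary sample rewards

# Incremental form: apply the update once per reward.
q = q1
for r in rewards:
    q = constant_step_update(q, r, alpha)

# Closed form: (1 - alpha)^n * Q_1 + sum_i alpha * (1 - alpha)^(n - i) * R_i
w0, ws = reward_weights(len(rewards), alpha)
closed_form = w0 * q1 + sum(w * r for w, r in zip(ws, rewards))

print(math.isclose(q, closed_form))      # do the two forms agree?
print(math.isclose(w0 + sum(ws), 1.0))   # do the weights sum to one?
print(ws[-1] > ws[0])                    # does the newest reward weigh more?
```

<p>Because the weight of a reward shrinks geometrically with its age, old rewards fade out on their own, which is why this form suits a problem whose reward distribution drifts over time.</p>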
<p>The code is <a href="https://github.com/dloranc/reinforcement-learning-an-introduction/blob/master/01_multi_arm_bandits/03_nonstationary.py" target="_blank" rel="noopener noreferrer">here</a>. In the next entry I will deal with another minor optimization of <strong>multi-armed bandit</strong>.</p>]]></content>
        <category label="Sutton &amp; Barto" term="Sutton &amp; Barto"/>
        <category label="DSP 2017" term="DSP 2017"/>
        <category label="Python" term="Python"/>
        <category label="Multi Armed Bandit" term="Multi Armed Bandit"/>
        <category label="Reinforcement learning" term="Reinforcement learning"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[TorchCraft - basic script]]></title>
        <id>https://dloranc.github.io/2017/05/21/torchcraft-basic-script</id>
        <link href="https://dloranc.github.io/2017/05/21/torchcraft-basic-script"/>
        <updated>2017-05-21T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[A post about the basics of working with TorchCraft. You will learn what maps in TorchCraft look like and how to create a basic script with a minimum of functionality.]]></summary>
        <content type="html"><![CDATA[<p><em>This post is about the Starcraft bot I am developing using machine learning. The project is being developed as part of the "Daj Się Poznać 2017" competition.</em></p>
<hr>
<p>This week I was going to write about Torch itself and how to create neural networks in it, but I decided that I would focus on the very basics of TorchCraft itself and its interaction with Starcraft. TorchCraft, unfortunately, has poor documentation and apart from the installation description, almost everything has to be figured out based on examples from the <code>examples</code> directory.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="maps">Maps<a class="hash-link" aria-label="Direct link to Maps" title="Direct link to Maps" href="https://dloranc.github.io/2017/05/21/torchcraft-basic-script#maps">​</a></h2>
<p>Let's start with the fact that TorchCraft can be run in two modes: micro and normal. In normal mode, you can play on regular multiplayer maps, so there are no surprises here. Micro maps contain no buildings; two groups of units fight on them, and the maps must be specially prepared for this mode. Four maps come with TorchCraft: <code>dragoons_zealots.scm</code>, <code>m5v5_c_far.scm</code>, <code>sp_dragoons_zealots.scm</code>, <code>sp_m5v5_c_far.scm</code>. These are actually two maps in two versions each. I don't know what the difference between the versions is; perhaps those with the <code>sp_</code> prefix have something to do with single player.</p>
<p>Micro maps contain evenly spaced special, invisible units that reveal parts of the map (so-called Map Revealers), as well as Start Locations, i.e. points where units appear.</p>
<p><img decoding="async" loading="lazy" alt="dragoons_zealots.scm" src="https://dloranc.github.io/assets/images/01_dragoons_zealots-2b355bb985857b30ec6b6b3c30a0fe00.png" width="1920" height="1040" class="img_r5VP"></p>
<p>For these units to appear, so-called triggers are needed. They are created in a special panel of the map editor. Triggers let you do many different things, e.g. place units on the map, control units, or write text on the screen; in general, they let you control the game. A trigger consists of two parts: <code>conditions</code>, which determine when the trigger fires, and <code>actions</code>, which are executed when it does. You can also set separately for which player the trigger is active.</p>
<p>Let's look at our micro maps:</p>
<p><img decoding="async" loading="lazy" alt="dragoons_zealots.scm" src="https://dloranc.github.io/assets/images/02_dragoons_zealots-9edec14cca680a02714513b53c3d6044.png" width="534" height="505" class="img_r5VP">
<img decoding="async" loading="lazy" alt="m5v5_c_far.scm" src="https://dloranc.github.io/assets/images/03_m5v5_c_far-7bdb7302ce6b85b4c638119b28bf60ef.png" width="534" height="505" class="img_r5VP">
<img decoding="async" loading="lazy" alt="m5v5_c_far.scm" src="https://dloranc.github.io/assets/images/04_m5v5_c_far-2e0108fd930a64e304bd5cfe890de033.png" width="534" height="505" class="img_r5VP">
<img decoding="async" loading="lazy" alt="m5v5_c_far.scm" src="https://dloranc.github.io/assets/images/05_m5v5_c_far-a44a8401fc31accc4491500a9a3dfe82.png" width="534" height="505" class="img_r5VP"></p>
<p>We see that triggers at the very beginning of each game create several units for both players, and if necessary, units are sent to attack.</p>
<p>Ok, enough about the maps. Time to tackle the code.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="the-code">The code<a class="hash-link" aria-label="Direct link to The code" title="Direct link to The code" href="https://dloranc.github.io/2017/05/21/torchcraft-basic-script#the-code">​</a></h2>
<p>First, let's write a minimal working example:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> hostname </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"192.168.56.1"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> port </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">11111</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> tc </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> require </span><span class="token string" style="color:#e3116c">'torchcraft'</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">micro_battles </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span 
class="token keyword" style="color:#00009f">true</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">init</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">hostname</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> port</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> update </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">connect</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">port</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">while</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">not</span><span class="token plain"> tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">state</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">game_ended </span><span class="token keyword" 
style="color:#00009f">do</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    update </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">receive</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic">-- code here</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">end</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">close</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" 
class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Let's analyze this example:</p>
<p>The first two lines set the address and port of the machine we connect to (the Windows host running Starcraft):</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> hostname </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"192.168.56.1"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> port </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">11111</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
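<p>Before running the script, it can help to confirm that the port is reachable from the machine running Torch. The following Python sketch is only a rough check and not part of TorchCraft: it assumes the server side accepts plain TCP connections on that port, and the helper name <code>can_connect</code> is made up for this example.</p>

```python
import socket


def can_connect(hostname, port, timeout=2.0):
    """Return True if a TCP connection to hostname:port can be opened."""
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Values from the example above; adjust them to your own setup.
    print(can_connect("192.168.56.1", 11111))
```

<p>A <code>False</code> result suggests that nothing is listening at that address or that something between the two machines blocks the connection.</p>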
<p>Then we load the TorchCraft library:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> tc </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> require </span><span class="token string" style="color:#e3116c">'torchcraft'</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>We switch TorchCraft to micro mode:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">micro_battles </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">true</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>We start the connection:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">init</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">hostname</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> port</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> update </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">connect</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">port</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" 
d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Then comes the loop in which we can give orders to units. The loop runs until the game itself ends (<code>tc.state.game_ended</code>); since we set no limit on the number of battles, they will be played one after another indefinitely:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">while</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">not</span><span class="token plain"> tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">state</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">game_ended </span><span class="token keyword" style="color:#00009f">do</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    update </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">receive</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic">-- code here</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">end</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span 
class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>At the very end we close the connection:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">close</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>As you can see, it's very simple. But let's have some more fun. Let's add some configuration above the <code>while</code> loop:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">local</span><span class="token plain"> setup </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">command</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">set_speed</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">20</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">command</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">set_gui</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span 
class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">command</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">set_cmd_optim</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">tc</span><span class="token punctuation" style="color:#393A34">:</span><span class="token function" style="color:#d73a49">send</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain">table</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">concat</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">setup</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">':'</span><span class="token punctuation" 
style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>These are three commands. <code>set_speed</code> controls the speed of the game: if we set it to zero, the game runs very fast, because the delay between logical frames is practically zero. <code>set_gui</code> set to <code>1</code> means that the action is displayed on screen. <code>set_cmd_optim</code> sets the command optimization level used by BWAPI: when enabled, BWAPI tries to reduce the number of actions by grouping similar commands and executing them together. You can set values from 0 to 4 (<a href="https://bwapi.github.io/class_b_w_a_p_i_1_1_game.html#a2e44b952a0a55416da1628237bbc82ea" target="_blank" rel="noopener noreferrer">documentation</a>).</p>
<p>Let's see what is sent to TorchCraft:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token function" style="color:#d73a49">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain">table</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">concat</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">setup</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">':'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>We will get:</p>
<div class="language-Lua language-lua codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-lua codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token number" style="color:#36acaa">1</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"6,20:8,1:10,1"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>We can see that commands are separated by colons, and each numeric command ID is separated from its value by a comma.</p>
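<p>To make the format concrete, here is a minimal sketch that builds and parses such a string. It is written in Python for illustration only and is not part of TorchCraft. The helper names <code>encode_commands</code> and <code>decode_commands</code> are hypothetical, and the sketch assumes every command carries exactly one integer value, as in the output above.</p>

```python
def encode_commands(commands):
    """Join (command_id, value) pairs into the 'id,value:id,value' format."""
    return ":".join(",".join(str(part) for part in command) for command in commands)


def decode_commands(message):
    """Split a message back into a list of (command_id, value) tuples.

    Assumption: each command has exactly one integer value.
    """
    pairs = []
    for chunk in message.split(":"):
        command_id, value = chunk.split(",")
        pairs.append((int(command_id), int(value)))
    return pairs


# The IDs and values below are copied from the printed output above.
message = encode_commands([(6, 20), (8, 1), (10, 1)])
print(message)
print(decode_commands(message))
```

<p>Commands that take more than one argument would need a more general parser than this one.</p>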
<h2 class="anchor anchorWithStickyNavbar_iC02" id="summary">Summary<a class="hash-link" aria-label="Direct link to Summary" title="Direct link to Summary" href="https://dloranc.github.io/2017/05/21/torchcraft-basic-script#summary">​</a></h2>
<p>Okay, that's it for now. Unfortunately, TorchCraft documentation is virtually non-existent, so I haven't been able to do much. For example, I haven't yet been able to figure out how to order a unit to move to another place without attacking anything along the way. Knowledge will have to be gained by reading code and constantly testing what works and what doesn't. It won't be easy :)</p>]]></content>
        <category label="Projects" term="Projects"/>
        <category label="DSP 2017" term="DSP 2017"/>
        <category label="Lua" term="Lua"/>
        <category label="Starcraft" term="Starcraft"/>
        <category label="torch" term="torch"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Multi-armed bandit - simple optimization]]></title>
        <id>https://dloranc.github.io/2017/05/01/multi-armed-bandit-simple-optimization</id>
        <link href="https://dloranc.github.io/2017/05/01/multi-armed-bandit-simple-optimization"/>
        <updated>2017-05-01T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[A post about a simple optimization that lowers the memory and CPU consumption of the algorithm presented in the previous post about the multi-armed bandit problem.]]></summary>
        <content type="html"><![CDATA[<p><em>This post is part of my struggle with the book <a href="http://incompleteideas.net/sutton/book/the-book-2nd.html" target="_blank" rel="noopener noreferrer">"Reinforcement Learning: An Introduction"</a> by Richard S. Sutton and Andrew G. Barto. Other posts systematizing my knowledge and presenting the code I wrote can be found under the tag <a href="https://dloranc.github.io/tags/sutton-and-barto">Sutton &amp; Barto</a> and in the repository <a href="https://github.com/dloranc/reinforcement-learning-an-introduction" target="_blank" rel="noopener noreferrer">dloranc/reinforcement-learning-an-introduction</a>.</em></p>
<hr>
<p>In the <a href="https://dloranc.github.io/2017/04/29/attack-of-multi-armed-bandits">last post</a> I discussed the basic version of the multi-armed bandit with the <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ϵ</mi></mrow><annotation encoding="application/x-tex">\epsilon</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">ϵ</span></span></span></span>-greedy strategy. The presented algorithm has a drawback: it records every reward and recalculates the arithmetic mean of the rewards for each action every time the best action is selected. Not only does the algorithm need memory for as many rewards as there are time steps, but every time it has to choose the best action it also repeats a lot of unnecessary and quite time-consuming calculations. Imagine having to calculate the arithmetic mean of one million rewards at every step. How long would that take? This can be solved better.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="optimization">Optimization<a class="hash-link" aria-label="Direct link to Optimization" title="Direct link to Optimization" href="https://dloranc.github.io/2017/05/01/multi-armed-bandit-simple-optimization#optimization">​</a></h2>
<p>Let's recall what the current code looks like:</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">__init__</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> arms</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> pulls</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> epsilon</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># ...</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> _ </span><span class="token keyword" style="color:#00009f">in</span><span class="token 
plain"> </span><span class="token builtin">xrange</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">arms</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">get_means</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    means </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">zeros</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">arms</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> index</span><span class="token punctuation" 
style="color:#393A34">,</span><span class="token plain"> action_rewards </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> </span><span class="token builtin">zip</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">range</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">len</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">means</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token builtin">len</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">action_rewards</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            means</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">index</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span 
class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">sum</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">action_rewards</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">/</span><span class="token plain"> </span><span class="token builtin">len</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">action_rewards</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> means</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">save_reward</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> action</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> reward</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token 
plain">rewards</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">action</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">append</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">reward</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>You can see that the means are recalculated from scratch every time. We could turn the <code>means</code> variable into a class field and write a new action value to it whenever a new reward arrives. However, that still leaves the calculation of the arithmetic mean itself, which is expensive in both computation and memory if we calculate it from the new reward and all the old ones.</p>
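<p>As a rough sketch of that intermediate step, the class below keeps <code>means</code> as a field and refreshes only the entry for the chosen action. The class name <code>CachedMeansBandit</code> is hypothetical, and the code uses Python 3 (<code>range</code> instead of <code>xrange</code>). Note that it still stores every reward and still sums the whole history of an action on each update, which is exactly the remaining cost.</p>

```python
class CachedMeansBandit:
    """Hypothetical sketch: caches the means but still stores every reward."""

    def __init__(self, arms):
        self.arms = arms
        self.rewards = [[] for _ in range(arms)]
        self.means = [0.0] * arms

    def save_reward(self, action, reward):
        self.rewards[action].append(reward)
        # Only the chosen action's mean changes, so refresh just that entry.
        # The sum still walks the full reward history of that action.
        history = self.rewards[action]
        self.means[action] = sum(history) / len(history)


bandit = CachedMeansBandit(arms=2)
for reward in [1.0, 0.0, 2.0]:
    bandit.save_reward(0, reward)
print(bandit.means)  # [1.0, 0.0]
```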
<p>Ok, time for some math. Let's define the value of an action as <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>Q</mi><mi>n</mi></msub></mrow><annotation encoding="application/x-tex">Q_n</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8778em;vertical-align:-0.1944em"></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span></span> after it has been selected <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">n - 1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6667em;vertical-align:-0.0833em"></span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span></span><span class="base"><span class="strut" style="height:0.6444em"></span><span class="mord">1</span></span></span></span> times:</p>
<span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable rowspacing="0.25em" columnalign="right" columnspacing=""><mtr><mtd class="mtr-glue"></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><msub><mi>Q</mi><mi>n</mi></msub><mo>≐</mo><mfrac><mrow><msub><mi>R</mi><mn>1</mn></msub><mo>+</mo><msub><mi>R</mi><mn>2</mn></msub><mo>+</mo><mo>⋯</mo><mo>+</mo><msub><mi>R</mi><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow></msub></mrow><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow></mfrac></mrow></mstyle></mtd><mtd class="mtr-glue"></mtd><mtd class="mml-eqn-num"></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{align}
Q_n \doteq \frac{R_1 + R_2 + \dots + R_{n - 1}}{n - 1}
\end{align}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.4297em;vertical-align:-0.9648em"></span><span class="mtable"><span class="col-align-r"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.4648em"><span style="top:-3.4648em"><span class="pstrut" style="height:3.3603em"></span><span class="mord"><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">≐</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3603em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord">1</span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:0.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t 
vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="minner">⋯</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span><span class="mbin mtight">−</span><span class="mord 
mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em"><span></span></span></span></span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.7693em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" 
style="height:0.9648em"><span></span></span></span></span></span></span></span><span class="tag"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.4648em"><span style="top:-3.4648em"><span class="pstrut" style="height:3.3603em"></span><span class="eqn-num"></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.9648em"><span></span></span></span></span></span></span></span></span>
<p>Now let's transform it into an incremental form that needs only the previous estimate and the newest reward:</p>
<span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable rowspacing="0.25em" columnalign="right" columnspacing=""><mtr><mtd class="mtr-glue"></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><msub><mi>Q</mi><mrow><mi>n</mi><mo>+</mo><mn>1</mn></mrow></msub><mo>=</mo><mfrac><mn>1</mn><mi>n</mi></mfrac><munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>n</mi></munderover><msub><mi>R</mi><mi>i</mi></msub></mrow></mstyle></mtd><mtd class="mtr-glue"></mtd><mtd class="mml-eqn-num"></mtd></mtr><mtr><mtd class="mtr-glue"></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mo>=</mo><mfrac><mn>1</mn><mi>n</mi></mfrac><mrow><mo fence="true">(</mo><msub><mi>R</mi><mi>n</mi></msub><mo>+</mo><munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow></munderover><msub><mi>R</mi><mi>i</mi></msub><mo fence="true">)</mo></mrow></mrow></mstyle></mtd><mtd class="mtr-glue"></mtd><mtd class="mml-eqn-num"></mtd></mtr><mtr><mtd class="mtr-glue"></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mo>=</mo><mfrac><mn>1</mn><mi>n</mi></mfrac><mrow><mo fence="true">(</mo><msub><mi>R</mi><mi>n</mi></msub><mo>+</mo><mrow><mo fence="true">(</mo><mi>n</mi><mo>−</mo><mn>1</mn><mo fence="true">)</mo></mrow><mfrac><mn>1</mn><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow></mfrac><munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow></munderover><msub><mi>R</mi><mi>i</mi></msub><mo fence="true">)</mo></mrow></mrow></mstyle></mtd><mtd class="mtr-glue"></mtd><mtd class="mml-eqn-num"></mtd></mtr><mtr><mtd class="mtr-glue"></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mo>=</mo><mfrac><mn>1</mn><mi>n</mi></mfrac><mrow><mo fence="true">(</mo><msub><mi>R</mi><mi>n</mi></msub><mo>+</mo><mrow><mo fence="true">(</mo><mi>n</mi><mo>−</mo><mn>1</mn><mo 
fence="true">)</mo></mrow><msub><mi>Q</mi><mi>n</mi></msub><mo fence="true">)</mo></mrow></mrow></mstyle></mtd><mtd class="mtr-glue"></mtd><mtd class="mml-eqn-num"></mtd></mtr><mtr><mtd class="mtr-glue"></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mo>=</mo><mfrac><mn>1</mn><mi>n</mi></mfrac><mrow><mo fence="true">(</mo><msub><mi>R</mi><mi>n</mi></msub><mo>+</mo><mi>n</mi><msub><mi>Q</mi><mi>n</mi></msub><mo>−</mo><msub><mi>Q</mi><mi>n</mi></msub><mo fence="true">)</mo></mrow></mrow></mstyle></mtd><mtd class="mtr-glue"></mtd><mtd class="mml-eqn-num"></mtd></mtr><mtr><mtd class="mtr-glue"></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mo>=</mo><msub><mi>Q</mi><mi>n</mi></msub><mo>+</mo><mfrac><mn>1</mn><mi>n</mi></mfrac><mrow><mo fence="true">[</mo><msub><mi>R</mi><mi>n</mi></msub><mo>−</mo><msub><mi>Q</mi><mi>n</mi></msub><mo fence="true">]</mo></mrow></mrow></mstyle></mtd><mtd class="mtr-glue"></mtd><mtd class="mml-eqn-num"></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{align}
Q_{n+1} = \frac{1}{n}\sum_{i=1}^{n}R_i \\
= \frac{1}{n}\left(R_n + \sum_{i = 1}^{n - 1} R_i\right) \\
= \frac{1}{n}\left(R_n + \left(n - 1\right)\frac{1}{n - 1} \sum_{i = 1}^{n - 1} R_i\right) \\
= \frac{1}{n}\left(R_n + \left(n - 1\right)Q_n\right) \\
= \frac{1}{n}\left(R_n + nQ_n - Q_n\right) \\
= Q_n + \frac{1}{n}\left[R_n - Q_n\right]
\end{align}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:16.9089em;vertical-align:-8.2045em"></span><span class="mtable"><span class="col-align-r"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:8.7045em"><span style="top:-10.8542em"><span class="pstrut" style="height:3.8011em"></span><span class="mord"><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span><span class="mbin mtight">+</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3214em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord mathnormal">n</span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:0.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em"><span></span></span></span></span></span><span class="mclose 
nulldelimiter"></span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.6514em"><span style="top:-1.8723em;margin-left:0em"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="mrel mtight">=</span><span class="mord mtight">1</span></span></span></span><span style="top:-3.05em"><span class="pstrut" style="height:3.05em"></span><span><span class="mop op-symbol large-op">∑</span></span></span><span style="top:-4.3em;margin-left:0em"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.2777em"><span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span><span style="top:-7.4754em"><span class="pstrut" style="height:3.8011em"></span><span class="mord"><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span 
class="vlist-r"><span class="vlist" style="height:1.3214em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord mathnormal">n</span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:0.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="minner"><span class="mopen delimcenter" style="top:0em"><span class="delimsizing size4">(</span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.8011em"><span style="top:-1.8723em;margin-left:0em"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="mrel mtight">=</span><span class="mord 
mtight">1</span></span></span></span><span style="top:-3.05em"><span class="pstrut" style="height:3.05em"></span><span><span class="mop op-symbol large-op">∑</span></span></span><span style="top:-4.3em;margin-left:0em"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.2777em"><span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mclose delimcenter" style="top:0em"><span class="delimsizing size4">)</span></span></span></span></span><span style="top:-4.0966em"><span class="pstrut" style="height:3.8011em"></span><span class="mord"><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3214em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord mathnormal">n</span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" 
style="border-bottom-width:0.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="minner"><span class="mopen delimcenter" style="top:0em"><span class="delimsizing size4">(</span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="minner"><span class="mopen delimcenter" style="top:0em">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord">1</span><span class="mclose delimcenter" style="top:0em">)</span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3214em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord 
mathnormal">n</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord">1</span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:0.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.7693em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.8011em"><span style="top:-1.8723em;margin-left:0em"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="mrel mtight">=</span><span class="mord mtight">1</span></span></span></span><span style="top:-3.05em"><span class="pstrut" style="height:3.05em"></span><span><span class="mop op-symbol large-op">∑</span></span></span><span style="top:-4.3em;margin-left:0em"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">n</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.2777em"><span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" 
style="height:0.3117em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mclose delimcenter" style="top:0em"><span class="delimsizing size4">)</span></span></span></span></span><span style="top:-1.1975em"><span class="pstrut" style="height:3.8011em"></span><span class="mord"><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3214em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord mathnormal">n</span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:0.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="minner"><span class="mopen delimcenter" style="top:0em">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 
size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="minner"><span class="mopen delimcenter" style="top:0em">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord">1</span><span class="mclose delimcenter" style="top:0em">)</span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mclose delimcenter" style="top:0em">)</span></span></span></span><span style="top:1.1099em"><span class="pstrut" style="height:3.8011em"></span><span class="mord"><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3214em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord mathnormal">n</span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span 
class="frac-line" style="border-bottom-width:0.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="minner"><span class="mopen delimcenter" style="top:0em">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal">n</span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span 
class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mclose delimcenter" style="top:0em">)</span></span></span></span><span style="top:3.4174em"><span class="pstrut" style="height:3.8011em"></span><span class="mord"><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3214em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord mathnormal">n</span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:0.04em"></span></span><span style="top:-3.677em"><span 
class="pstrut" style="height:3em"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="minner"><span class="mopen delimcenter" style="top:0em">[</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mclose delimcenter" style="top:0em">]</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:8.2045em"><span></span></span></span></span></span></span></span><span class="tag"><span class="vlist-t 
vlist-t2"><span class="vlist-r"><span class="vlist" style="height:8.7045em"><span style="top:-10.8542em"><span class="pstrut" style="height:3.8011em"></span><span class="eqn-num"></span></span><span style="top:-7.4754em"><span class="pstrut" style="height:3.8011em"></span><span class="eqn-num"></span></span><span style="top:-4.0966em"><span class="pstrut" style="height:3.8011em"></span><span class="eqn-num"></span></span><span style="top:-1.1975em"><span class="pstrut" style="height:3.8011em"></span><span class="eqn-num"></span></span><span style="top:1.1099em"><span class="pstrut" style="height:3.8011em"></span><span class="eqn-num"></span></span><span style="top:3.4174em"><span class="pstrut" style="height:3.8011em"></span><span class="eqn-num"></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:8.2045em"><span></span></span></span></span></span></span></span></span>
<p>As you can see, we no longer need to store every reward. It is enough to keep the latest estimate <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>Q</mi><mi>n</mi></msub></mrow><annotation encoding="application/x-tex">Q_n</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8778em;vertical-align:-0.1944em"></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span></span> and the count <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi></mrow><annotation encoding="application/x-tex">n</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">n</span></span></span></span> for each action.</p>
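<p>To convince yourself that the last line of the derivation really gives the ordinary average, here is a minimal sketch in plain Python. The function name <code>incremental_mean</code> and the sample rewards are made up for illustration and do not come from the repository.</p>

```python
def incremental_mean(rewards):
    """Average the rewards one at a time, keeping only Q and n."""
    q = 0.0  # current estimate Q
    n = 0    # number of rewards seen so far
    for r in rewards:
        n += 1
        q += (r - q) / n  # Q <- Q + 1/n * [R - Q]
    return q


# Made-up sample rewards, used only to compare the two ways of averaging.
sample_rewards = [1.0, 0.0, 0.0, 1.0, 1.0]
batch_mean = sum(sample_rewards) / len(sample_rewards)

print(incremental_mean(sample_rewards), batch_mean)
```

<p>Both printed values should agree up to floating-point rounding, even though the incremental version never looks at more than one reward at a time.</p>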
<h2 class="anchor anchorWithStickyNavbar_iC02" id="the-code">The code<a class="hash-link" aria-label="Direct link to The code" title="Direct link to The code" href="https://dloranc.github.io/2017/05/01/multi-armed-bandit-simple-optimization#the-code">​</a></h2>
<p>In the constructor, we replace <code>self.rewards = [[] for _ in xrange(self.arms)]</code> with <code>self.rewards = np.zeros(self.arms)</code>, which now holds one running estimate per arm instead of a list of rewards, and we add <code>action_count</code> (our <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi></mrow><annotation encoding="application/x-tex">n</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">n</span></span></span></span> for each arm):</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">__init__</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> arms</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> pulls</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> epsilon</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">action_count </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">zeros</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">arms</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards 
</span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">zeros</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">arms</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The <code>save_reward</code> function looks like this:</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">save_reward</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> action</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> reward</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># there is another reward, so we increase n by one</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">action_count</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">action</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" 
style="color:#999988;font-style:italic"># calculate Q(A) = Q(A) + 1 / N(A)[R - Q(A)]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">action</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">action</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token number" style="color:#36acaa">1.</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">/</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">action_count</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">action</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">reward </span><span class="token operator" style="color:#393A34">-</span><span class="token plain"> 
self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">action</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
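<p>The two fragments above can be put together into a small self-contained sketch. This is not the script from the repository: the class name <code>IncrementalBandit</code>, the random action choice and the Gaussian rewards are assumptions made for illustration, and plain lists stand in for <code>np.zeros</code> so that it runs without NumPy. The <code>history</code> list exists only to check the result against the ordinary average.</p>

```python
import random


class IncrementalBandit:
    """Hypothetical minimal agent: only the bookkeeping described above."""

    def __init__(self, arms):
        self.arms = arms
        self.action_count = [0] * arms  # n for each arm
        self.rewards = [0.0] * arms     # running estimate Q for each arm

    def save_reward(self, action, reward):
        # there is another reward, so we increase n by one
        self.action_count[action] += 1
        # Q(A) = Q(A) + 1 / N(A) * [R - Q(A)]
        self.rewards[action] += \
            (reward - self.rewards[action]) / self.action_count[action]


random.seed(0)
bandit = IncrementalBandit(arms=3)
history = [[] for _ in range(bandit.arms)]  # kept only for the comparison

for _ in range(1000):
    action = random.randrange(bandit.arms)  # assumed: purely random choice
    reward = random.gauss(action, 1.0)      # assumed: arm k pays about k
    bandit.save_reward(action, reward)
    history[action].append(reward)

for arm in range(bandit.arms):
    print(arm, bandit.rewards[arm], sum(history[arm]) / len(history[arm]))
```

<p>For every arm the two printed numbers should match up to rounding, while the agent itself stores only two numbers per arm.</p>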
<h2 class="anchor anchorWithStickyNavbar_iC02" id="summary">Summary<a class="hash-link" aria-label="Direct link to Summary" title="Direct link to Summary" href="https://dloranc.github.io/2017/05/01/multi-armed-bandit-simple-optimization#summary">​</a></h2>
<p>It was very simple. The entire script is in the <a href="https://github.com/dloranc/reinforcement-learning-an-introduction/blob/master/01_multi_arm_bandits/02_incremental.py" target="_blank" rel="noopener noreferrer">repository</a>, alongside the first, unoptimized one. After this change, it is worth comparing the execution times of the two scripts, <code>01_simple.py</code> and <code>02_incremental.py</code>. You can do this with the <code>time</code> command. Both scripts contain a fairly time-consuming experiment with a large number of calculations, so let's check.</p>
<div class="language-shell codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-shell codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">$ </span><span class="token function" style="color:#d73a49">time</span><span class="token plain"> python 01_simple.py</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">real    2m10.297s</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">user    2m3.564s</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">sys     0m0.436s</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<div class="language-shell codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-shell codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">$ </span><span class="token function" style="color:#d73a49">time</span><span class="token plain"> python 02_incremental.py</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">real    0m57.918s</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">user    0m57.268s</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">sys     0m0.304s</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
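<p>As you can see, the difference is significant: the incremental version finishes in just under a minute, while the first one needs more than two minutes. That's it for now. The next post will be about the non-stationary version of the multi-armed bandit.</p>]]></content>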
        <category label="Sutton &amp; Barto" term="Sutton &amp; Barto"/>
        <category label="DSP 2017" term="DSP 2017"/>
        <category label="Python" term="Python"/>
        <category label="Multi Armed Bandit" term="Multi Armed Bandit"/>
        <category label="Reinforcement learning" term="Reinforcement learning"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[TorchCraft installation]]></title>
        <id>https://dloranc.github.io/2017/04/30/torchcraft-installation</id>
        <link href="https://dloranc.github.io/2017/04/30/torchcraft-installation"/>
        <updated>2017-04-30T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[What's up with my project? Not great: I'm struggling to get TorchCraft set up and working.]]></summary>
        <content type="html"><![CDATA[<p><em>This post is about the Starcraft bot I am developing using machine learning. The project is being developed as part of the "Daj Się Poznać 2017" competition.</em></p>
<hr>
<p>I spent this whole week wondering which path to take for my bot project. I wanted to use Deeplearning4j, but this library conflicts with BWMirror, which requires 32-bit Java. One alternative is to port the entire project from BWMirror to JNIBWAPI. Another is to rewrite the project in C++, which doesn't appeal to me because I'm not comfortable with that language. I'm not that comfortable with Java either, but Java is easier to learn. In the end, though, I decided to go with TorchCraft. Of course, I lose the opportunity to participate in SSCAIT, but I think it is better to finally do something related to reinforcement learning in a ready-made environment. Doing the same in Java would take me a lot of time.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="installation-on-linux">Installation on Linux<a class="hash-link" aria-label="Direct link to Installation on Linux" title="Direct link to Installation on Linux" href="https://dloranc.github.io/2017/04/30/torchcraft-installation#installation-on-linux">​</a></h2>
<p>It took me a while to install Torch, Lua and TorchCraft. As it happens, something didn't work when I followed the instructions on <a href="https://github.com/TorchCraft/TorchCraft/blob/master/docs/user/installation.md" target="_blank" rel="noopener noreferrer">the project website</a>. After spending quite a long time figuring out how to set everything up, I concluded that it is best to install Torch using the instructions from <a href="http://torch.ch/docs/getting-started.html#_" target="_blank" rel="noopener noreferrer">this website</a>.</p>
<p>Going back to <a href="https://github.com/TorchCraft/TorchCraft/blob/master/docs/user/installation.md" target="_blank" rel="noopener noreferrer">TorchCraft installation description</a> instead of:</p>
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token function" style="color:#d73a49">git</span><span class="token plain"> clone git@github.com:torchcraft/torchcraft.git </span><span class="token parameter variable" style="color:#36acaa">--recursive</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>You need to clone:</p>
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token function" style="color:#d73a49">git</span><span class="token plain"> clone https://github.com/TorchCraft/TorchCraft.git </span><span class="token parameter variable" style="color:#36acaa">--recursive</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="installation-on-windows">Installation on Windows<a class="hash-link" aria-label="Direct link to Installation on Windows" title="Direct link to Installation on Windows" href="https://dloranc.github.io/2017/04/30/torchcraft-installation#installation-on-windows">​</a></h2>
<p>Here it was quite simple, because everything came down to copying files. I used <strong>BWEnv.exe</strong> from the manual.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="tying-everything-together">Tying everything together<a class="hash-link" aria-label="Direct link to Tying everything together" title="Direct link to Tying everything together" href="https://dloranc.github.io/2017/04/30/torchcraft-installation#tying-everything-together">​</a></h2>
<p>I installed everything. Time to launch it. On the Linux virtual machine, you need to run the following in the console:</p>
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">th simple_exe.lua </span><span class="token parameter variable" style="color:#36acaa">-t</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">$server_ip</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Instead of <strong>$server_ip</strong> you need to substitute the host IP. To find it, run the <code>ipconfig</code> command in Windows and use one of the addresses it lists. There's probably nothing else you need to do on Linux.</p>
<p>It's worse on Windows. I ran <strong>BWEnv.exe</strong> and there's one thing I can't figure out: after running this file, something like this is displayed:</p>
<div class="codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-text codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">Welcome to the Brood War TorchCraft Environment.</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Compiled on Mar 18 2017, 07:48:54.</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">&lt;Config Info&gt;</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  loaded: 1</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  current path: C:/StarCraft/bwapi-data/torchcraft.ini</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">general</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  port = 0</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  log_path = C:/tc_data/torchcraft_log_cpp_port_</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  display_log = 0</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  img_mode = raw</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  window_mode = windows</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  window_mode_custom =</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  img_save_path = C:/tc_data/output_</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">starcraft</span><br></span><span class="token-line" style="color:#393A34"><span 
class="token plain">  assume_on = 0</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  launcher = injectory</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  custom_launcher =</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">C:/StarCraft//TorchCraft/\BWEnv\bin\injectory.x86.exe --launch C:/StarCraft/\StarCraft.exe --inject C:/StarCraft/\bwapi-data\BWAPI.dll</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">While running command:</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        00996F98</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">CreateProcess failed: 2</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Connecting...</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The path to Starcraft is incorrect; my installation is in a completely different place. I couldn't find out whether it is possible to pass any arguments to BWEnv.exe. The source code shows that there is no such option, so I will have to modify and recompile the project.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="summary">Summary<a class="hash-link" aria-label="Direct link to Summary" title="Direct link to Summary" href="https://dloranc.github.io/2017/04/30/torchcraft-installation#summary">​</a></h2>
<p>I didn't get much done this week. That's life, I always get stopped by some stupid technical problems. I'll try to put it all together in the near future. To make things funnier, something went wrong with the bot's Java project, even though I didn't change anything in the code. When I try to compile the bot, I get the following error: <code>Exception in thread "main" java.lang.UnsatisfiedLinkError: {path}\bwapi_bridge2_5.dll: Can't find dependent libraries</code>, even though all DLLs are where they were and it worked before. How is that even possible?</p>
<p>As a consolation, I must say that my bot, despite its simplicity, is doing well on the SSCAIT ladder. I achieved a 60% win rate (39 wins, 42 losses), which in my opinion is a good result for such a simple bot.</p>]]></content>
        <category label="Projects" term="Projects"/>
        <category label="DSP 2017" term="DSP 2017"/>
        <category label="Starcraft" term="Starcraft"/>
        <category label="bwapi" term="bwapi"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Attack of multi-armed bandits]]></title>
        <id>https://dloranc.github.io/2017/04/29/attack-of-multi-armed-bandits</id>
        <link href="https://dloranc.github.io/2017/04/29/attack-of-multi-armed-bandits"/>
        <updated>2017-04-29T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Post about the multi-armed bandits - an interesting reinforcement learning problem.]]></summary>
        <content type="html"><![CDATA[<p><em>This post is part of my struggle with the book <a href="http://incompleteideas.net/sutton/book/the-book-2nd.html" target="_blank" rel="noopener noreferrer">"Reinforcement Learning: An Introduction"</a> by Richard S. Sutton and Andrew G. Barto. Other posts systematizing my knowledge and presenting the code I wrote can be found under the tag <a href="https://dloranc.github.io/tags/sutton-and-barto">Sutton &amp; Barto</a> and in the repository <a href="https://github.com/dloranc/reinforcement-learning-an-introduction" target="_blank" rel="noopener noreferrer">dloranc/reinforcement-learning-an-introduction</a>.</em></p>
<hr>
<p>The <strong>multi-armed bandit problem</strong> (or <strong>k-armed bandit problem</strong>) is one of the reinforcement learning problems. I don't know if it's the simplest one, but it offers a relatively quick introduction to the subject and a way to become familiar with the basic concepts.</p>
<p>Let's imagine that we are in a casino, playing a dozen or so one-armed bandits. The slot machines differ: pulling the lever on some of them pays out more than on others. The reward from each machine is drawn from a certain probability distribution, so it is sometimes larger and sometimes smaller. For now, we assume that the distributions do not change over time. Our goal is to collect the highest total reward, which means finding the machine with the highest average payout. We have to spend some time looking for a good machine, but we want to start using the best one as quickly as possible.</p>
<p>The above example with slot machines is, of course, quite artificial. In practice, <strong>multi-armed bandits</strong> can be used for things such as:</p>
<ul>
<li><em>clinical trials</em> - to find the most effective experimental therapy and minimize the negative effects of the weaker therapies on patients.</li>
<li><em>investment portfolio management</em> - to find the best strategy for our investment portfolio.</li>
<li><em>instead of A/B tests</em> - to reduce conversion losses. In A/B tests traffic is divided equally, regardless of how each version of the website performs. In <strong>MAB</strong> the traffic is gradually shifted towards the best version.</li>
</ul>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="epsilon-greedy-strategy">Epsilon-greedy strategy<a class="hash-link" aria-label="Direct link to Epsilon-greedy strategy" title="Direct link to Epsilon-greedy strategy" href="https://dloranc.github.io/2017/04/29/attack-of-multi-armed-bandits#epsilon-greedy-strategy">​</a></h2>
<p>Ok, let's start with the simplest strategy for solving <strong>MAB</strong>, which is the <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ϵ</mi></mrow><annotation encoding="application/x-tex">\epsilon</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">ϵ</span></span></span></span>-greedy strategy. What we are looking for is the so-called action value. Let us denote the action selected in step <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi></mrow><annotation encoding="application/x-tex">t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6151em"></span><span class="mord mathnormal">t</span></span></span></span> as <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>A</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">A_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em"></span><span class="mord"><span class="mord mathnormal">A</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span></span> and the corresponding reward as <span class="katex"><span 
class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>R</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">R_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span></span>. The <em>value</em> of the selected action, denoted as <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>q</mi><mo>∗</mo></msub><mo stretchy="false">(</mo><mi>a</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">q_*(a)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.03588em">q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1757em"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mbin mtight">∗</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">a</span><span class="mclose">)</span></span></span></span>, is the expected reward (average of rewards) when action <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>a</mi></mrow><annotation encoding="application/x-tex">a</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">a</span></span></span></span> is selected:</p>
<span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable rowspacing="0.25em" columnalign="right" columnspacing=""><mtr><mtd class="mtr-glue"></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><msub><mi>q</mi><mo>∗</mo></msub><mo stretchy="false">(</mo><mi>a</mi><mo stretchy="false">)</mo><mo>=</mo><mi mathvariant="double-struck">E</mi><mo stretchy="false">[</mo><msub><mi>R</mi><mi>t</mi></msub><mi mathvariant="normal">∣</mi><msub><mi>A</mi><mi>t</mi></msub><mo>=</mo><mi>a</mi><mo stretchy="false">]</mo></mrow></mstyle></mtd><mtd class="mtr-glue"></mtd><mtd class="mml-eqn-num"></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{align}
q_*(a) = \mathbb{E}[R_t | A_t = a]
\end{align}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.5em;vertical-align:-0.5em"></span><span class="mtable"><span class="col-align-r"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1em"><span style="top:-3.16em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord"><span class="mord mathnormal" style="margin-right:0.03588em">q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1757em"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mbin mtight">∗</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">a</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord mathbb">E</span><span class="mopen">[</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.00773em">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mord">∣</span><span class="mord"><span class="mord mathnormal">A</span><span class="msupsub"><span class="vlist-t 
vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord mathnormal">a</span><span class="mclose">]</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.5em"><span></span></span></span></span></span></span></span><span class="tag"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1em"><span style="top:-3em"><span class="pstrut" style="height:2.84em"></span><span class="eqn-num"></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.5em"><span></span></span></span></span></span></span></span></span>
<p>If we knew the value of each action, we could easily solve <strong>MAB</strong>: just always choose the action with the highest value. However, we do not know these values, so we need to estimate them. We denote the estimated value of action <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>a</mi></mrow><annotation encoding="application/x-tex">a</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">a</span></span></span></span> at time <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi></mrow><annotation encoding="application/x-tex">t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6151em"></span><span class="mord mathnormal">t</span></span></span></span> as <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>Q</mi><mi>t</mi></msub><mo stretchy="false">(</mo><mi>a</mi><mo stretchy="false">)</mo><mo>≈</mo><msub><mi>q</mi><mo>∗</mo></msub><mo stretchy="false">(</mo><mi>a</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">Q_t(a) \approx q_*(a)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord"><span class="mord mathnormal">Q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span 
class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">a</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">≈</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.03588em">q</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1757em"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mbin mtight">∗</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">a</span><span class="mclose">)</span></span></span></span>. At each time step <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi></mrow><annotation encoding="application/x-tex">t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6151em"></span><span class="mord mathnormal">t</span></span></span></span> there is at least one action that has the highest value. If we choose another, worse action, we explore. In the short term, we lose by performing this worse action, but we hope that in the long term we will accumulate a greater cumulative value of rewards if it turns out that one of these worse actions is better than the current best one. 
By accumulating knowledge about the rewards for each action, we become more and more confident about which action is the best. We do this both by exploring and by using the knowledge we have already acquired. An interesting question is when and how often to explore rather than exploit the acquired knowledge. The simplest method uses a predetermined value of <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ϵ</mi></mrow><annotation encoding="application/x-tex">\epsilon</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">ϵ</span></span></span></span>. At each step, with probability <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>1</mn><mo>−</mo><mi>ϵ</mi></mrow><annotation encoding="application/x-tex">1 - \epsilon</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.7278em;vertical-align:-0.0833em"></span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span></span><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">ϵ</span></span></span></span>, we choose the best action (knowledge exploitation phase, <em>exploitation</em>). We select the best action by calculating the arithmetic average of the rewards received so far for each action separately and selecting the action with the highest average. 
However, with probability <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ϵ</mi></mrow><annotation encoding="application/x-tex">\epsilon</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">ϵ</span></span></span></span>, exploration takes place, during which we randomly select any action. Regardless of whether we explore or not, we save the received reward assigned to the selected action.</p>
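<p>The selection rule described above can be sketched in a few lines of Python. This is only a minimal sketch and not the code from this post: the function name <code>epsilon_greedy</code>, the <code>rewards_per_arm</code> list, the choice of 10 arms and the treatment of never-pulled arms as having an estimate of 0 are all assumptions made for illustration.</p>

```python
import numpy as np


def epsilon_greedy(rewards_per_arm, epsilon, rng):
    """Pick an arm index with the epsilon-greedy rule (hypothetical helper).

    rewards_per_arm: list of lists with the rewards seen so far for each arm.
    """
    if rng.random() < epsilon:
        # Exploration: pick any arm uniformly at random.
        return int(rng.integers(len(rewards_per_arm)))
    # Exploitation: estimate Q_t(a) as the arithmetic average of each arm's
    # rewards (assumption: an arm that was never pulled gets an estimate of 0).
    estimates = [np.mean(r) if r else 0.0 for r in rewards_per_arm]
    return int(np.argmax(estimates))


rng = np.random.default_rng(0)
true_reward = rng.normal(0.0, 1.0, size=10)  # assumed hidden q_*(a) per arm
rewards_per_arm = [[] for _ in true_reward]

for _ in range(1000):
    arm = epsilon_greedy(rewards_per_arm, epsilon=0.1, rng=rng)
    reward = true_reward[arm] + rng.normal()  # noisy reward around q_*(a)
    rewards_per_arm[arm].append(reward)  # saved whether we explored or not
```

<p>Keeping every reward in a list is wasteful for long runs, but it mirrors the description most directly: the estimate for an arm is literally the mean of everything that arm has paid out so far.</p>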
<h2 class="anchor anchorWithStickyNavbar_iC02" id="time-for-the-code">Time for the code<a class="hash-link" aria-label="Direct link to Time for the code" title="Direct link to Time for the code" href="https://dloranc.github.io/2017/04/29/attack-of-multi-armed-bandits#time-for-the-code">​</a></h2>
<p>I don't know if I described everything clearly above, but I think the code should make things clearer. The following example contains a <code>Bandit</code> class that runs the <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ϵ</mi></mrow><annotation encoding="application/x-tex">\epsilon</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">ϵ</span></span></span></span>-greedy strategy with the three parameters given in the constructor: the number of actions (arms), the number of time steps (pulls) and epsilon. The constructor also initializes the <code>true_reward</code> variable with random values; then, when the algorithm runs, each reward is returned as the <code>true_reward</code> value plus some noise (to make things easier). The whole essence of the <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ϵ</mi></mrow><annotation encoding="application/x-tex">\epsilon</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">ϵ</span></span></span></span>-greedy strategy is contained in the <code>choose_action</code> method, where epsilon decides between exploration and exploitation.</p>
<p>The main code (under <code>__main__</code>) contains an example of a single run of the algorithm and, perhaps more interestingly, a comparison of three different values of <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ϵ</mi></mrow><annotation encoding="application/x-tex">\epsilon</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">ϵ</span></span></span></span> (0, 0.1, 0.01). I performed 2000 runs for each of them and averaged the results, so the chart below shows how the average reward for a given epsilon changes with the number of iterations.</p>
<p><img decoding="async" loading="lazy" alt="Plot for different epsilons" src="https://dloranc.github.io/assets/images/plot-f572e76ed9ef7329884c23826088bffb.png" title="Plot for different epsilons" width="640" height="480" class="img_r5VP"></p>
<p>You can see that for a value of 0.1, the optimal action is found quickly, but will never be selected more than 91% of the time. For a value of 0.01, the optimal action is found more slowly, but in the long run it will achieve better average results.</p>
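<p>The 91% ceiling follows from simple arithmetic, assuming the bandit has 10 arms (an assumption that happens to match the 91% figure). Even after the best arm has been identified, it is picked with probability 1 - epsilon during exploitation, plus a 1-in-10 chance during each random exploration step. The helper name <code>max_greedy_share</code> below is hypothetical and used only for this check.</p>

```python
def max_greedy_share(epsilon, arms):
    """Upper bound on how often the best arm is picked under epsilon-greedy."""
    # With probability 1 - epsilon we exploit; with probability epsilon we pick
    # uniformly at random, which still hits the best arm 1 time in `arms`.
    return (1 - epsilon) + epsilon / arms


for eps in (0.1, 0.01):
    # Assumed 10 arms; prints the ceiling as a fraction of all pulls.
    print(eps, round(max_greedy_share(eps, arms=10), 3))
```

<p>Under the same assumption, the ceiling for epsilon equal to 0.01 is 99.1%, which is why the smaller epsilon wins in the long run despite learning more slowly.</p>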
<p>Okay, here's the code:</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">'''</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">Multi-armed bandit with e-greedy strategy</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">With saving all rewards for each arm</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">'''</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> numpy </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> np</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> matplotlib</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">pyplot </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> plt</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">from</span><span class="token plain"> random </span><span class="token 
keyword" style="color:#00009f">import</span><span class="token plain"> randint</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> random</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">class</span><span class="token plain"> </span><span class="token class-name">Bandit</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">__init__</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> arms</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> pulls</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> epsilon</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">arms </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> arms</span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain">        self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">pulls </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> pulls</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">epsilon </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> epsilon</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">history </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># random values from normal distribution</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">true_reward </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token 
plain">randn</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> _ </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> </span><span class="token builtin">range</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">arms</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> _ </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> </span><span class="token builtin">range</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">arms</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" 
style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">get_means</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        means </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">zeros</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">arms</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> index</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> action_rewards </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> </span><span class="token builtin">zip</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">range</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">len</span><span class="token punctuation" style="color:#393A34">(</span><span class="token 
plain">means</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token builtin">len</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">action_rewards</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                means</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">index</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">sum</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">action_rewards</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">/</span><span class="token plain"> </span><span class="token builtin">len</span><span class="token punctuation" 
style="color:#393A34">(</span><span class="token plain">action_rewards</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> means</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">choose_action</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        rand </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">uniform</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" 
style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># select action with 1 - epsilon probability</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> rand </span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">epsilon</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token comment" style="color:#999988;font-style:italic"># exploit</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            means </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">get_means</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain">  </span><span class="token comment" style="color:#999988;font-style:italic"># compute all means</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            argmax </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">argmax</span><span class="token punctuation" style="color:#393A34">(</span><span class="token 
plain">means</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic"># select arm with best estimated reward</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> argmax</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">else</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token comment" style="color:#999988;font-style:italic"># explore</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> randint</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token builtin">len</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">-</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">get_reward</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> action</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">true_reward</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">action</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">randn</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic"># true reward with noise</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    
</span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">save_reward</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> action</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> reward</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">action</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">append</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">reward</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">run</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token 
plain">        </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> t </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> </span><span class="token builtin">range</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">pulls</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            action </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">choose_action</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            reward </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">get_reward</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">action</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">save_reward</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">action</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> 
reward</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">history</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">append</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">reward</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> __name__ </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'__main__'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># example bandit</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    bandit </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> Bandit</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">arms</span><span class="token operator" 
style="color:#393A34">=</span><span class="token number" style="color:#36acaa">10</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> pulls</span><span class="token operator" style="color:#393A34">=</span><span class="token number" style="color:#36acaa">2000</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> epsilon</span><span class="token operator" style="color:#393A34">=</span><span class="token number" style="color:#36acaa">0.01</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    bandit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">run</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> arm</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> reward</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> true_reward </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> </span><span class="token builtin">zip</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">range</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token builtin">len</span><span 
class="token punctuation" style="color:#393A34">(</span><span class="token plain">bandit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                                        bandit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rewards</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> bandit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">true_reward</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        pulls </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">len</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">reward</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"Arm {} pulls: {}, true reward: {}"</span><span class="token punctuation" style="color:#393A34">.</span><span 
class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token builtin">format</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">arm</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> pulls</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> true_reward</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"Best arm: {}"</span><span class="token punctuation" style="color:#393A34">.</span><span class="token builtin">format</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">argmax</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">bandit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">true_reward</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span 
class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># experiments</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    pulls </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1000</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    experiments </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">2000</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    epsilons </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">0.01</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.1</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    mean_outcomes </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">np</span><span 
class="token punctuation" style="color:#393A34">.</span><span class="token plain">zeros</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">pulls</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> _ </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> epsilons</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> _ </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> </span><span class="token builtin">range</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">experiments</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> index</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> epsilon </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> </span><span class="token builtin">zip</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">range</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">len</span><span class="token punctuation" style="color:#393A34">(</span><span 
class="token plain">epsilons</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> epsilons</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            bandit </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> Bandit</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">arms</span><span class="token operator" style="color:#393A34">=</span><span class="token number" style="color:#36acaa">10</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> pulls</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">pulls</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> epsilon</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">epsilon</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            bandit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">run</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            mean_outcomes</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">index</span><span class="token punctuation" style="color:#393A34">]</span><span 
class="token plain"> </span><span class="token operator" style="color:#393A34">+=</span><span class="token plain"> bandit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">history</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> index</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> epsilon </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> </span><span class="token builtin">zip</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">range</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">len</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">epsilons</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> epsilons</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        mean_outcomes</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">index</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">/=</span><span class="token plain"> experiments</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        plt</span><span 
class="token punctuation" style="color:#393A34">.</span><span class="token plain">plot</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">mean_outcomes</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">index</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> label</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"epsilon: "</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> </span><span class="token builtin">str</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">epsilon</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    plt</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">ylabel</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"Average reward"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    plt</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">xlabel</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"Steps"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span 
class="token-line" style="color:#393A34"><span class="token plain">    plt</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">legend</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    plt</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">savefig</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'01_plot.png'</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div>
<p>Sample output:</p>
<div class="language-shell codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-shell codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">1</span><span class="token plain">	pulls: </span><span class="token number" style="color:#36acaa">3</span><span class="token plain">,		</span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token parameter variable" style="color:#36acaa">-0.903469191365</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">2</span><span class="token plain">	pulls: </span><span class="token number" style="color:#36acaa">3</span><span class="token plain">,		</span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token number" style="color:#36acaa">0.365839293594</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">3</span><span class="token plain">	pulls: </span><span class="token number" style="color:#36acaa">3</span><span class="token plain">,		</span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token parameter variable" style="color:#36acaa">-0.854871239295</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">4</span><span class="token plain">	
pulls: </span><span class="token number" style="color:#36acaa">2</span><span class="token plain">,		</span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token parameter variable" style="color:#36acaa">-0.445679248867</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">5</span><span class="token plain">	pulls: </span><span class="token number" style="color:#36acaa">94</span><span class="token plain">,		</span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token number" style="color:#36acaa">1.0921733926</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">6</span><span class="token plain">	pulls: </span><span class="token number" style="color:#36acaa">3</span><span class="token plain">,		</span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token parameter variable" style="color:#36acaa">-0.123881634804</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">7</span><span class="token plain">	pulls: </span><span class="token number" style="color:#36acaa">0</span><span class="token plain">,		</span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token parameter variable" style="color:#36acaa">-0.928756860211</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">8</span><span class="token 
plain">	pulls: </span><span class="token number" style="color:#36acaa">5</span><span class="token plain">,		</span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token number" style="color:#36acaa">0.860238065648</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">9</span><span class="token plain">	pulls: </span><span class="token number" style="color:#36acaa">1885</span><span class="token plain">,	</span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token number" style="color:#36acaa">1.81443343678</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">10</span><span class="token plain">	pulls: </span><span class="token number" style="color:#36acaa">2</span><span class="token plain">,		</span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token number" style="color:#36acaa">0.247351866388</span><br></span></code></pre></div></div>
<p>It's not always rosy:</p>
<div class="language-shell codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-shell codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">1</span><span class="token plain"> pulls: </span><span class="token number" style="color:#36acaa">1244</span><span class="token plain">, </span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token number" style="color:#36acaa">0.865701931312</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">2</span><span class="token plain"> pulls: </span><span class="token number" style="color:#36acaa">1</span><span class="token plain">, </span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token parameter variable" style="color:#36acaa">-0.0986266557818</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">3</span><span class="token plain"> pulls: </span><span class="token number" style="color:#36acaa">2</span><span class="token plain">, </span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token parameter variable" style="color:#36acaa">-0.93574271516</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">4</span><span class="token plain"> 
pulls: </span><span class="token number" style="color:#36acaa">4</span><span class="token plain">, </span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token number" style="color:#36acaa">0.273764997199</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">5</span><span class="token plain"> pulls: </span><span class="token number" style="color:#36acaa">2</span><span class="token plain">, </span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token parameter variable" style="color:#36acaa">-2.24599693076</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">6</span><span class="token plain"> pulls: </span><span class="token number" style="color:#36acaa">3</span><span class="token plain">, </span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token parameter variable" style="color:#36acaa">-0.0837555511977</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">7</span><span class="token plain"> pulls: </span><span class="token number" style="color:#36acaa">208</span><span class="token plain">, </span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token number" style="color:#36acaa">0.679058209084</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">8</span><span class="token plain"> pulls: 
</span><span class="token number" style="color:#36acaa">2</span><span class="token plain">, </span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token parameter variable" style="color:#36acaa">-0.983193383821</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">9</span><span class="token plain"> pulls: </span><span class="token number" style="color:#36acaa">2</span><span class="token plain">, </span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token parameter variable" style="color:#36acaa">-0.0941783631123</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Arm </span><span class="token number" style="color:#36acaa">10</span><span class="token plain"> pulls: </span><span class="token number" style="color:#36acaa">532</span><span class="token plain">, </span><span class="token boolean" style="color:#36acaa">true</span><span class="token plain"> reward: </span><span class="token number" style="color:#36acaa">1.78310799226</span><br></span></code></pre></div></div>
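<p>To put a number on the difference between the two runs, one can check what share of the pulls went to the arm with the highest true reward. The sketch below is only an illustration: the helper <code>best_arm_share</code> and the variables <code>good_run</code> and <code>bad_run</code> are names made up for this example, and the data is copied by hand from the two outputs above.</p>

```python
# Pull counts and true rewards copied from the two sample outputs above.
good_run = {
    "pulls": [3, 3, 3, 2, 94, 3, 0, 5, 1885, 2],
    "true_rewards": [
        -0.903469191365, 0.365839293594, -0.854871239295, -0.445679248867,
        1.0921733926, -0.123881634804, -0.928756860211, 0.860238065648,
        1.81443343678, 0.247351866388,
    ],
}
bad_run = {
    "pulls": [1244, 1, 2, 4, 2, 3, 208, 2, 2, 532],
    "true_rewards": [
        0.865701931312, -0.0986266557818, -0.93574271516, 0.273764997199,
        -2.24599693076, -0.0837555511977, 0.679058209084, -0.983193383821,
        -0.0941783631123, 1.78310799226,
    ],
}


def best_arm_share(pulls, true_rewards):
    """Return (1-based number of the best arm, fraction of all pulls it got)."""
    best = max(range(len(true_rewards)), key=lambda i: true_rewards[i])
    return best + 1, pulls[best] / sum(pulls)


for name, run in (("good run", good_run), ("bad run", bad_run)):
    arm, share = best_arm_share(run["pulls"], run["true_rewards"])
    print(f"{name}: best arm is {arm}, it received {share:.2%} of the pulls")
```

<p>In the first run the best arm (9) received 1885 of 2000 pulls. In the second run the best arm (10) received only 532, while most pulls went to arm 1, whose true reward is roughly half as large.</p>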
<h2 class="anchor anchorWithStickyNavbar_iC02" id="summary">Summary<a class="hash-link" aria-label="Direct link to Summary" title="Direct link to Summary" href="https://dloranc.github.io/2017/04/29/attack-of-multi-armed-bandits#summary">​</a></h2>
<p>I presented the simplest strategy for solving the <strong>multi-armed bandit problem</strong>, known as the <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ϵ</mi></mrow><annotation encoding="application/x-tex">\epsilon</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">ϵ</span></span></span></span>-greedy strategy. Of course, there are many more. You can imagine letting the value of <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ϵ</mi></mrow><annotation encoding="application/x-tex">\epsilon</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">ϵ</span></span></span></span> decrease over time, or an algorithm that starts with an exploration phase and only then exploits the acquired knowledge, and many other strategies. But more on that in subsequent posts.</p>]]></content>
        <category label="Sutton &amp; Barto" term="Sutton &amp; Barto"/>
        <category label="DSP 2017" term="DSP 2017"/>
        <category label="Python" term="Python"/>
        <category label="Multi Armed Bandit" term="Multi Armed Bandit"/>
        <category label="Reinforcement learning" term="Reinforcement learning"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Participation in SSCAIT]]></title>
        <id>https://dloranc.github.io/2017/04/23/participation-in-sscait</id>
        <link href="https://dloranc.github.io/2017/04/23/participation-in-sscait"/>
        <updated>2017-04-23T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[A post about my participation in SSCAIT and further improvements to the bot.]]></summary>
        <content type="html"><![CDATA[<p><em>This post is about the Starcraft bot I am developing using machine learning. The project is being developed as part of the "Daj Się Poznać 2017" competition.</em></p>
<hr>
<p>So far, I have been testing my bot on my own computer against the default bot created by Blizzard. It's time to test it against bots written by other people. I know that instead of the standard bot you can connect another one by using <strong>Chaoslauncher - MultiInstance</strong> and editing <code>bwapi.ini</code>, but I preferred to first see what the registration procedure for <a href="http://sscaitournament.com/" target="_blank" rel="noopener noreferrer">SSCAIT</a> looks like.</p>
<p>To register a bot in <strong>SSCAIT</strong>, in addition to providing various data, you need to generate a <code>.jar</code> file and put it in a ZIP archive together with <code>BWAPI.dll</code>, the DLL responsible for connecting to Starcraft. Each bot can use a different version of BWAPI, so you need to include your own DLL. Without thinking much, I generated an <code>artifact</code> in IntelliJ IDEA, but after registration it turned out that my bot crashed six times and was blocked. I had done something wrong and didn't really know what. However, I noticed that the results list contains links for downloading bots. I downloaded one written in Java, changed the extension from <code>.jar</code> to <code>.zip</code> and checked the file structure. It should look like this:</p>
<p><img decoding="async" loading="lazy" alt="artifacts" src="https://dloranc.github.io/assets/images/artifacts-5d461311f55d41f5ec76ff5a0f422364.png" title="artifacts" width="1019" height="305" class="img_r5VP"></p>
<p>In particular, the archive must contain the <code>META-INF</code> directory with a <code>MANIFEST.MF</code> file that defines the entry point, so that Java knows which class contains the <code>main</code> method.</p>
<div class="language-ini codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-ini codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">Manifest-Version: 1.0</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Main-Class: dloranc.fivepoolbot.FivePoolBot</span><br></span></code></pre></div></div>
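<p>Before uploading, it may be worth checking that the generated <code>.jar</code> really declares an entry point. Below is a minimal sketch in Python; it is not part of the bot, and the helper <code>read_main_class</code> and the file name <code>demo.jar</code> are made up for this example. It relies only on the fact that a <code>.jar</code> is a ZIP archive, and it assumes the <code>Main-Class</code> entry fits on a single manifest line.</p>

```python
import os
import tempfile
import zipfile

MANIFEST_PATH = "META-INF/MANIFEST.MF"


def read_main_class(jar_path):
    """Return the Main-Class declared in the jar's manifest, or None.

    Simplification: manifest continuation lines (long wrapped values)
    are not handled, so very long class names would be cut off.
    """
    with zipfile.ZipFile(jar_path) as jar:
        if MANIFEST_PATH not in jar.namelist():
            return None
        manifest = jar.read(MANIFEST_PATH).decode("utf-8")
    for line in manifest.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "Main-Class":
            return value.strip()
    return None


# Demo: build a tiny stand-in jar in a temporary directory and inspect it.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_jar = os.path.join(tmp_dir, "demo.jar")
    with zipfile.ZipFile(demo_jar, "w") as jar:
        jar.writestr(
            MANIFEST_PATH,
            "Manifest-Version: 1.0\n"
            "Main-Class: dloranc.fivepoolbot.FivePoolBot\n",
        )
    print(read_main_class(demo_jar))
```

<p>If the function returns <code>None</code> for a real artifact, the manifest is missing or has no <code>Main-Class</code> entry, which matches the kind of packaging mistake described above.</p>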
<p>This took me a while to figure out, so I'm sharing it for future reference, because the tutorial on the SSCAIT website didn't contain much information about creating a working package.</p>
<p>As for my results in SSCAIT, after a few days I have won 15 games and lost 20 (not counting crashes). In my opinion, that's not bad for such a simple bot, especially since I was ahead of 15 bots in the <a href="http://sscaitournament.com/index.php?action=scores" target="_blank" rel="noopener noreferrer">scorelist</a>.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="serious-bug-in-the-code-from-the-tutorial">Serious bug in the code from the tutorial<a class="hash-link" aria-label="Direct link to Serious bug in the code from the tutorial" title="Direct link to Serious bug in the code from the tutorial" href="https://dloranc.github.io/2017/04/23/participation-in-sscait#serious-bug-in-the-code-from-the-tutorial">​</a></h2>
<p>While watching the replays, I noticed that my bot was behaving incorrectly. Sometimes, for no apparent reason, it was unable to detect enemy buildings, so the attack went to the wrong location and the game was lost. After quite a lot of thinking, I found out that the bot was not remembering the locations of the enemy's buildings correctly. I even managed to visualize it. You can see this in the GIF below:</p>
<p><img decoding="async" loading="lazy" alt="Incorrect operation" src="https://dloranc.github.io/assets/images/invalid_rescaled-d1109e8fb51113145a6d442a14123a1a.gif" title="Incorrect operation" width="605" height="284" class="img_r5VP"></p>
<p>Very interesting, because I copied the code for this solution from the tutorial on the SSCAIT website and assumed that everything worked. But no - some buildings did not have a red dot or it disappeared every frame, which resulted in the dot flashing on the screen.</p>
<p>A riddle for you: where is the error in the code below?</p>
<div class="language-Java language-java codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-java codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">//always loop over all currently visible enemy units (even though this set is usually empty)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">Unit</span><span class="token plain"> u </span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> game</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">enemy</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getUnits</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">	</span><span class="token comment" style="color:#999988;font-style:italic">//if this unit is in fact a building</span><span class="token plain"></span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain">	</span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">u</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isBuilding</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">		</span><span class="token comment" style="color:#999988;font-style:italic">//check if we have it's position in memory and add it if we don't</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">		</span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token operator" style="color:#393A34">!</span><span class="token plain">enemyBuildingMemory</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">contains</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">u</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getPosition</span><span class="token punctuation" style="color:#393A34">(</span><span 
class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> enemyBuildingMemory</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">add</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">u</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getPosition</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">	</span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic">//loop over all the positions that we remember</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">Position</span><span class="token plain"> p </span><span class="token operator" style="color:#393A34">:</span><span 
class="token plain"> enemyBuildingMemory</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">	</span><span class="token comment" style="color:#999988;font-style:italic">// compute the TilePosition corresponding to our remembered Position p</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">	</span><span class="token class-name">TilePosition</span><span class="token plain"> tileCorrespondingToP </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">new</span><span class="token plain"> </span><span class="token class-name">TilePosition</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">p</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getX</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token operator" style="color:#393A34">/</span><span class="token number" style="color:#36acaa">32</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> p</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getY</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token operator" style="color:#393A34">/</span><span class="token number" style="color:#36acaa">32</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" 
style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">	</span><span class="token comment" style="color:#999988;font-style:italic">//if that tile is currently visible to us...</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">	</span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">game</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isVisible</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">tileCorrespondingToP</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">		</span><span class="token comment" style="color:#999988;font-style:italic">//loop over all the visible enemy buildings and find out if at least</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">		</span><span class="token comment" style="color:#999988;font-style:italic">//one of them is still at that remembered position</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">		</span><span class="token keyword" 
style="color:#00009f">boolean</span><span class="token plain"> buildingStillThere </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">false</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">		</span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">Unit</span><span class="token plain"> u </span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> game</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">enemy</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getUnits</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">			</span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">u</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getType</span><span class="token punctuation" style="color:#393A34">(</span><span 
class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isBuilding</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;&amp;</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">u</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getPosition</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> p</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">				buildingStillThere </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">true</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">				</span><span class="token keyword" style="color:#00009f">break</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">			</span><span class="token punctuation" 
style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">		</span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">		</span><span class="token comment" style="color:#999988;font-style:italic">//if there is no more any building, remove that position from our memory</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">		</span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">buildingStillThere </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">false</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">			enemyBuildingMemory</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">remove</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">p</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">			</span><span class="token keyword" style="color:#00009f">break</span><span class="token punctuation" 
style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">		</span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">	</span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>It took me a long time to track down, and the error turned out to be trivial. It was in the following line:</p>
<div class="language-Java language-java codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-java codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">u</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isBuilding</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;&amp;</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">u</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getPosition</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> p</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token 
punctuation" style="color:#393A34">{</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>I replaced it with the line below (and got rid of the redundant parentheses while I was at it):</p>
<div class="language-Java language-java codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-java codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">u</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isBuilding</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;&amp;</span><span class="token plain"> u</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getPosition</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">equals</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">p</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" 
aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>In Java, if we want to check whether two objects hold the same value, we use <code>equals()</code>, because the <code>==</code> operator only checks whether both references point to the same object.</p>
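<p>The difference is easy to reproduce outside the game. Here is a minimal sketch that uses a hypothetical <code>Pos</code> class (my own stand-in, not the BWAPI <code>Position</code> class) with a value-based <code>equals()</code>. Two separately created objects with the same coordinates are equal according to <code>equals()</code>, but not according to <code>==</code>:</p>

```java
import java.util.Objects;

// Hypothetical stand-in for a position type; this is NOT the BWAPI class.
final class Pos {
    private final int x;
    private final int y;

    Pos(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Value-based equality: two Pos objects are equal when their coordinates match.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Pos)) return false;
        Pos other = (Pos) o;
        return x == other.x && y == other.y;
    }

    // hashCode must be consistent with equals, otherwise hash-based collections misbehave.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class EqualsDemo {
    // Compares references: true only if both variables point to the same object.
    static boolean sameReference(Pos a, Pos b) {
        return a == b;
    }

    // Compares values using equals().
    static boolean sameValue(Pos a, Pos b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Pos remembered = new Pos(64, 96);
        Pos current = new Pos(64, 96);

        System.out.println(sameReference(remembered, current)); // false
        System.out.println(sameValue(remembered, current));     // true
    }
}
```

<p>This is presumably what happened in my loop: the remembered position and the one returned by <code>getPosition()</code> were different objects, so the <code>==</code> check never matched even though the coordinates were the same.</p>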
<p>After the fix, nothing flashes:</p>
<p><img decoding="async" loading="lazy" alt="Correct action" src="https://dloranc.github.io/assets/images/valid_rescaled-3c2857bc1ea7e3273f35d6a093774276.gif" title="Correct action" width="605" height="284" class="img_r5VP"></p>]]></content>
        <category label="Projects" term="Projects"/>
        <category label="Java" term="Java"/>
        <category label="DSP 2017" term="DSP 2017"/>
        <category label="Starcraft" term="Starcraft"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Learning BWAPI]]></title>
        <id>https://dloranc.github.io/2017/04/16/learning-bwapi</id>
        <link href="https://dloranc.github.io/2017/04/16/learning-bwapi"/>
        <updated>2017-04-16T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[We fine-tune the bot, learn BWAPI, its debugging functions and other useful things.]]></summary>
        <content type="html"><![CDATA[<p><em>This post is about the Starcraft bot I am developing using machine learning. The project is being developed as part of the "Daj Się Poznać 2017" competition.</em></p>
<hr>
<p>I spent this week polishing up my Starcraft bot. I didn't want to take up reinforcement learning yet because I preferred to spend my time learning about BWAPI. The code is available in my repository <a href="https://github.com/dloranc/five-pool-bot" target="_blank" rel="noopener noreferrer">dloranc/five-pool-bot</a>.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="fixes">Fixes<a class="hash-link" aria-label="Direct link to Fixes" title="Direct link to Fixes" href="https://dloranc.github.io/2017/04/16/learning-bwapi#fixes">​</a></h2>
<p>In a previous post I wrote a to-do list. Some of the items have been completed (they are crossed out below):</p>
<ul>
<li>Mineral gathering can be optimized as described in this thread on <a href="http://www.teamliquid.net/forum/brood-war/484849-improving-mineral-gathering-rate-in-brood-war" target="_blank" rel="noopener noreferrer">TeamLiquid</a>.</li>
<li>
<strike>The scout should be a newly produced drone, not one taken from the initial ones.</strike>
</li>
<li>
<strike>If the drone chosen to build is carrying minerals, it should return them to the base before building. Currently, any minerals it is carrying are lost.</strike>
</li>
<li>
<strike>Scouting could be more efficient: the drone should visit the bases closest to it first.</strike>
</li>
<li>
<strike>Sometimes when a drone is going to the last base on a large map and zerglings are produced, they move to random places on the map because they don't know where the opponent's base is. This can be improved by sending them to the base where the drone is heading, because then you know that this is the right base.</strike>
</li>
<li>Zerglings could fight better: they could prioritize what to attack first. It would also be useful to withdraw severely wounded units so they can regenerate.</li>
<li>
<strike>If the base and buildings around it are destroyed and the game is not over, it means that there is a building somewhere on the map that needs to be destroyed. Zerglings don't even search randomly, they just gather in one place.</strike>
</li>
<li>In general, it would be useful to write some class that allows you to give orders to units and cancel them when certain circumstances arise.</li>
</ul>
<p>Making the drone return its minerals before building the <strong>Spawning Pool</strong> turned out to be a particularly interesting experience. The logic should look something like this:</p>
<div class="language-Java language-java codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-java codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">buildDrone</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isCarryingMinerals</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">	buildDrone</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">returnCargo</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">else</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">	</span><span class="token comment" 
style="color:#999988;font-style:italic">// ...</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">	buildDrone</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">build</span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">UnitType</span><span class="token class-name punctuation" style="color:#393A34">.</span><span class="token class-name">Zerg_Spawning_Pool</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> buildPosition</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>But I found that sometimes it doesn't work. Looking at the <a href="https://bwapi.github.io/class_b_w_a_p_i_1_1_unitset.html#a0b24b5f25b609169c0fafbe70d2f60aa" target="_blank" rel="noopener noreferrer">BWAPI documentation</a> of the <code>returnCargo</code> function, I noticed the following comment:</p>
<blockquote>
<p>There is a small chance for a command to fail after it has been passed to Broodwar.</p>
</blockquote>
<p>Interestingly, this note appears under almost every function in that part of the documentation. I wonder what causes a command to fail. And can I be sure there wasn't some problem with my own code?</p>
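<p>One way to cope with commands that occasionally get lost, sketched here under my own assumptions, is to decide what to do from the observed state of the unit on every frame instead of assuming that a command issued earlier went through. The <code>Worker</code> interface and the <code>FlakyDrone</code> class below are hypothetical test doubles, not BWAPI types; only the method names <code>isCarryingMinerals</code>, <code>returnCargo</code> and <code>build</code> come from the code above.</p>

```java
// Hypothetical minimal view of a worker unit. The method names mirror the
// calls used in this post, but this interface is an assumption, not BWAPI.
interface Worker {
    boolean isCarryingMinerals();
    boolean returnCargo();   // in the real game this may be accepted and still fail
    boolean build();         // stands in for build(UnitType, position)
}

// Fake drone that loses the first few returnCargo commands, to imitate
// "a small chance for a command to fail".
final class FlakyDrone implements Worker {
    private boolean carrying = true;
    private int commandsToDrop;
    int returnCargoCalls = 0;

    FlakyDrone(int commandsToDrop) {
        this.commandsToDrop = commandsToDrop;
    }

    public boolean isCarryingMinerals() {
        return carrying;
    }

    public boolean returnCargo() {
        returnCargoCalls++;
        if (commandsToDrop > 0) {
            commandsToDrop--;   // command accepted but has no effect
            return true;
        }
        carrying = false;       // simplification: cargo is delivered instantly
        return true;
    }

    public boolean build() {
        return !carrying;       // building only succeeds with empty hands here
    }
}

class BuildOrder {
    private boolean buildIssued = false;

    // Call once per frame. Because the decision depends on the current state,
    // a lost returnCargo command is simply issued again on the next frame.
    void onFrame(Worker drone) {
        if (buildIssued) {
            return;
        }
        if (drone.isCarryingMinerals()) {
            drone.returnCargo();
        } else {
            buildIssued = drone.build();
        }
    }

    boolean isBuildIssued() {
        return buildIssued;
    }

    public static void main(String[] args) {
        FlakyDrone drone = new FlakyDrone(2);
        BuildOrder order = new BuildOrder();
        int frames = 0;
        while (!order.isBuildIssued()) {
            order.onFrame(drone);
            frames++;
        }
        System.out.println("build issued after " + frames + " frames");
    }
}
```

<p>In a real bot it would probably make sense to throttle how often the command is repeated rather than sending it on every single frame, but I haven't measured that.</p>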
<p>I solved this problem differently, though not entirely optimally: I pick a drone that is not carrying minerals. It is possible that every drone is carrying minerals, in which case the building algorithm has to wait until one of them delivers its cargo to the base. I don't know whether such a small waste of time can ever be crucial to winning a game. I don't think so, but I like to get everything 100% right.</p>
<div class="language-Java language-java codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-java codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token class-name">Unit</span><span class="token plain"> buildDrone </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">null</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">Unit</span><span class="token plain"> myUnit </span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getUnits</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token 
plain">myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token class-name">UnitType</span><span class="token class-name punctuation" style="color:#393A34">.</span><span class="token class-name">Zerg_Drone</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token operator" style="color:#393A34">!</span><span class="token plain">myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isCarryingMinerals</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;&amp;</span><span class="token plain"> myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getID</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">!=</span><span class="token plain"> scoutDrone</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" 
style="color:#d73a49">getID</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            buildDrone </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> myUnit</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">break</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" 
d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="drawing-auxiliary-figures">Drawing auxiliary figures<a class="hash-link" aria-label="Direct link to Drawing auxiliary figures" title="Direct link to Drawing auxiliary figures" href="https://dloranc.github.io/2017/04/16/learning-bwapi#drawing-auxiliary-figures">​</a></h2>
<p><strong>BWAPI</strong> provides <a href="https://bwapi.github.io/class_b_w_a_p_i_1_1_game.html" target="_blank" rel="noopener noreferrer">a dozen or so cool methods for drawing</a> points, lines, squares, circles, ellipses, triangles, as well as for writing text on the screen.</p>
<p>Each method starts with the word <code>draw</code> and is called on an object of type <code>Game</code>. After <code>draw</code> comes the name of the shape we want to draw (<code>Text</code>, <code>Box</code>, <code>Triangle</code>, <code>Dot</code>, <code>Circle</code>, <code>Ellipse</code>, <code>Line</code>), optionally followed by a word that tells <strong>BWAPI</strong> how to interpret the coordinates we pass in. If we want to draw a box on the map, we call the <code>drawBoxMap</code> method; if we want to draw on the screen, we call <code>drawBoxScreen</code>. In both cases the position <code>(0, 0)</code> is the upper left corner (of the map or of the screen, respectively). We can also use <code>drawBoxMouse</code>, in which case the position <code>(0, 0)</code> is wherever the cursor currently is.</p>
<div class="language-cpp codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-cpp codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token function" style="color:#d73a49">drawBox</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">CoordinateType</span><span class="token double-colon punctuation" style="color:#393A34">::</span><span class="token plain">Enum ctype</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">int</span><span class="token plain"> left</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">int</span><span class="token plain"> top</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">int</span><span class="token plain"> right</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">int</span><span class="token plain"> bottom</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> Color color</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">bool</span><span class="token plain"> isSolid</span><span class="token operator" style="color:#393A34">=</span><span class="token boolean" style="color:#36acaa">false</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span 
class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token function" style="color:#d73a49">drawBoxMap</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">Position leftTop</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> Position rightBottom</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> Color color</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">bool</span><span class="token plain"> isSolid</span><span class="token operator" style="color:#393A34">=</span><span class="token boolean" style="color:#36acaa">false</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token function" style="color:#d73a49">drawBoxMouse</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">Position leftTop</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> Position rightBottom</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> Color color</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">bool</span><span class="token plain"> isSolid</span><span class="token operator" style="color:#393A34">=</span><span class="token boolean" style="color:#36acaa">false</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token function" style="color:#d73a49">drawBoxScreen</span><span class="token punctuation" 
style="color:#393A34">(</span><span class="token plain">Position leftTop</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> Position rightBottom</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> Color color</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">bool</span><span class="token plain"> isSolid</span><span class="token operator" style="color:#393A34">=</span><span class="token boolean" style="color:#36acaa">false</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic">// ...</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token function" style="color:#d73a49">drawLineMap</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">Position a</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> Position b</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> Color color</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token function" style="color:#d73a49">drawLineMouse</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">Position a</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> Position b</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> Color color</span><span class="token 
punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token function" style="color:#d73a49">drawLineScreen</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">Position a</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> Position b</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> Color color</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div>
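<p>The relationship between the three coordinate types can be sketched in a few lines. This is plain Python written only for illustration, not BWAPI code: the <code>Point</code> class, the <code>to_screen</code> function and the string tags <code>"map"</code>, <code>"screen"</code> and <code>"mouse"</code> are hypothetical names, and the sketch assumes we know the map position of the screen's upper left corner and the cursor position on the screen.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """A pixel position; a stand-in for BWAPI's Position, not the real class."""
    x: int
    y: int

    def __add__(self, other: "Point") -> "Point":
        return Point(self.x + other.x, self.y + other.y)

    def __sub__(self, other: "Point") -> "Point":
        return Point(self.x - other.x, self.y - other.y)


def to_screen(point: Point, ctype: str, screen_origin: Point, mouse: Point) -> Point:
    """Convert a point given in one coordinate type to screen pixels.

    ctype is a hypothetical tag mirroring the Map/Screen/Mouse suffixes.
    screen_origin is the map position of the screen's upper left corner.
    mouse is the cursor position on the screen.
    """
    if ctype == "map":
        # Map coordinates are measured from the upper left corner of the map,
        # so the current scroll offset has to be subtracted.
        return point - screen_origin
    if ctype == "screen":
        # Screen coordinates are already relative to the screen's corner.
        return point
    if ctype == "mouse":
        # Mouse coordinates are offsets from wherever the cursor currently is.
        return point + mouse
    raise ValueError(f"unknown coordinate type: {ctype}")


screen_origin = Point(640, 480)  # the view is scrolled 640 px right, 480 px down
mouse = Point(100, 50)           # cursor position on the screen

print(to_screen(Point(700, 500), "map", screen_origin, mouse))   # Point(x=60, y=20)
print(to_screen(Point(0, 0), "screen", screen_origin, mouse))    # Point(x=0, y=0)
print(to_screen(Point(0, 0), "mouse", screen_origin, mouse))     # Point(x=100, y=50)
```

<p>In practice the suffix simply decides whether a figure stays glued to the terrain, to the screen, or to the cursor.</p>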
<p>I used this to visualize the orders that the units carry out (in red in the screenshot below). I also marked chokepoints on the map, i.e. narrow passages that are strategically important (yellow), and drew the zones where ground units cannot move (gray). The only thing I still don't understand is why I had to multiply all the positions of the polygon points by 8. I also marked the location of the enemy's base with a circle (see the screenshot in the header). This was useful for me when debugging scouting.</p>
<p><img decoding="async" loading="lazy" alt="Auxiliary lines" src="https://dloranc.github.io/assets/images/01-986467f9ddeb0c1751f9077c5716f5b3.jpg" title="Auxiliary lines" width="1276" height="950" class="img_r5VP"></p>
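<p>One plausible explanation for the factor of 8, offered here as an assumption rather than something confirmed in this post: Brood War's walkability grid is made of 8×8-pixel cells, so if the polygon points arrive in walk-tile units, they have to be scaled by 8 before they are passed to a drawing method that expects pixels. A minimal Python sketch of that conversion (the function name and the sample polygon are made up for illustration):</p>

```python
WALK_TILE_PX = 8  # one walkability cell is 8x8 pixels


def walk_to_pixels(points):
    """Scale (x, y) points from walk-tile units to pixel units."""
    return [(x * WALK_TILE_PX, y * WALK_TILE_PX) for x, y in points]


# A hypothetical triangle given in walk-tile units.
polygon = [(10, 10), (20, 10), (15, 18)]
print(walk_to_pixels(polygon))  # [(80, 80), (160, 80), (120, 144)]
```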
<h2 class="anchor anchorWithStickyNavbar_iC02" id="other-useful-things">Other useful things<a class="hash-link" aria-label="Direct link to Other useful things" title="Direct link to Other useful things" href="https://dloranc.github.io/2017/04/16/learning-bwapi#other-useful-things">​</a></h2>
<p>On <a href="http://www.starcraftai.com/wiki/Increasing_StarCraft_Speed" target="_blank" rel="noopener noreferrer">starcraftai.com</a> I discovered two ways to speed up the game. The first way is to completely turn off the sound engine, because normally even if we set the volume to zero, the functions that play sound and music will be called. Turning off the engine prevents these functions from being activated. To tell you the truth, I didn't notice the game speeding up much, but I turned it off just to be on the safe side.</p>
<div class="language-ini codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-ini codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token section punctuation" style="color:#393A34">[</span><span class="token section section-name selector" style="color:#00009f">starcraft</span><span class="token section punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic">; Game sound engine = ON | OFF</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token key attr-name" style="color:#00a4db">sound</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">=</span><span class="token plain"> </span><span class="token value attr-value" style="color:#e3116c">OFF</span><br></span></code></pre></div></div>
<p>The second cool thing is the ability to turn off graphics rendering when the match starts. Well, almost, because from what I saw, sometimes it will render a frame.</p>
<div class="language-Java language-java codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-java codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">game</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">setGUI</span><span class="token punctuation" style="color:#393A34">(</span><span class="token boolean" style="color:#36acaa">false</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><br></span></code></pre></div></div>
<p>I discovered that it is also possible to run Starcraft in <strong>headless</strong> mode (without rendering any graphics) from the command line. This can be done using the <a href="https://github.com/tscmoo/bwheadless" target="_blank" rel="noopener noreferrer">tscmoo/bwheadless</a> project. I managed to compile it, but haven't tested it yet. It will definitely be useful. It's a pity that this project doesn't even have a <strong>README.md</strong>.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="whats-next">What's next?<a class="hash-link" aria-label="Direct link to What's next?" title="Direct link to What's next?" href="https://dloranc.github.io/2017/04/16/learning-bwapi#whats-next">​</a></h2>
<p>Now I will finally get started with reinforcement learning. My guess is that most games, even against bots with good micro, can be won with nothing more than an early Zergling attack. Given perfect micro, Protoss in particular may have big problems, because its basic unit, the <strong>Zealot</strong>, is very slow and is unable to keep up with the Zerglings. All I need is for my bot to focus on the buildings while avoiding the opponent's units as much as possible, and the game will be won. This is of course only my hypothesis; we'll see how it works out in practice.</p>
<p>The question remains what to use. I tried TorchCraft, but it didn't solve my problems. TorchCraft uses the <strong>Lua</strong> language, and I would like to enter my bot in <a href="http://sscaitournament.com/" target="_blank" rel="noopener noreferrer">SSCAIT</a>, where you can only submit bots written in Java and C++. Maybe I could combine what I've already written with <a href="https://deeplearning4j.org/" target="_blank" rel="noopener noreferrer">deeplearning4j</a>? I think I'll do that; I want to see how good my bot really is.</p>]]></content>
        <category label="Projects" term="Projects"/>
        <category label="DSP 2017" term="DSP 2017"/>
        <category label="Starcraft" term="Starcraft"/>
        <category label="bwapi" term="bwapi"/>
        <category label="Java" term="Java"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Reinforcement learning - what is it?]]></title>
        <id>https://dloranc.github.io/2017/04/16/reinforcement-learning-what-is-it</id>
        <link href="https://dloranc.github.io/2017/04/16/reinforcement-learning-what-is-it"/>
        <updated>2017-04-16T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[A post in which I explain what reinforcement learning is.]]></summary>
        <content type="html"><![CDATA[<h2 class="anchor anchorWithStickyNavbar_iC02" id="before-we-start">Before we start<a class="hash-link" aria-label="Direct link to Before we start" title="Direct link to Before we start" href="https://dloranc.github.io/2017/04/16/reinforcement-learning-what-is-it#before-we-start">​</a></h2>
<p>For some time now I have been slowly working my way through the book <a href="http://incompleteideas.net/sutton/book/the-book-2nd.html" target="_blank" rel="noopener noreferrer">"Reinforcement Learning: An Introduction"</a> by Richard S. Sutton and Andrew G. Barto. Someone asked for some good RL materials in one of my posts, so I'm sharing and recommending this book. It is reportedly a classic in this field. In my opinion it deserves that reputation, if one can say so after reading less than two chapters.</p>
<p>To consolidate my knowledge, I decided to write code for the algorithms found in this book. There is sample code on the website linked above, but I don't intend to use it (unless I get completely stuck on something). Repository (empty for now): <a href="https://github.com/dloranc/reinforcement-learning-an-introduction" target="_blank" rel="noopener noreferrer">dloranc/reinforcement-learning-an-introduction</a>.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="reinforcement-learning---what-is-it">Reinforcement Learning - what is it?<a class="hash-link" aria-label="Direct link to Reinforcement Learning - what is it?" title="Direct link to Reinforcement Learning - what is it?" href="https://dloranc.github.io/2017/04/16/reinforcement-learning-what-is-it#reinforcement-learning---what-is-it">​</a></h2>
<p>Reinforcement learning is a way of solving problems that cannot be solved in a simple (analytical) way, or for which we do not have a good model. Something we'll call an <strong>agent</strong> performs <strong>actions</strong> in an unknown <strong>environment</strong> that is in a certain <strong>state</strong>. For each action, the agent receives a reinforcement signal, which can be positive (<strong>reward</strong>) or negative (<strong>punishment</strong>). By interacting with the environment, the agent learns a <strong>policy</strong>, i.e. a rule for choosing an action in a given state. The goal of reinforcement learning is to find the optimal policy, the one that collects as much reward as possible. Finding such a policy is not easy. Sometimes, for example, it is worth sacrificing an immediate reward to gain more in the long run.</p>
<p><img decoding="async" loading="lazy" alt="Reinforcement learning - schema" src="https://dloranc.github.io/assets/images/reinforcement_learning_english-9e56ce708d7d62a92d15360f9171c7d3.svg" title="Reinforcement learning - schema" width="809" height="658" class="img_r5VP"></p>
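<p>The loop from the diagram can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions, not code from the book: the <code>Corridor</code> environment, the reward values and the two policies are made up, and nothing is learned yet; the sketch only shows how agent, action, state and reward fit together.</p>

```python
import random


class Corridor:
    """A toy environment: five cells in a row, the agent starts in the middle.

    Reaching the right end gives a reward (+1), reaching the left end
    gives a punishment (-1); both ends finish the episode.
    """

    def __init__(self):
        self.state = 2

    def step(self, action):
        """Apply an action (-1 = left, +1 = right); return (state, reward, done)."""
        self.state += action
        if self.state == 4:
            return self.state, 1, True
        if self.state == 0:
            return self.state, -1, True
        return self.state, 0, False


def run_episode(policy, max_steps=100):
    """Let the agent interact with the environment and sum up the reinforcement."""
    env = Corridor()
    state, total = env.state, 0
    for _ in range(max_steps):
        action = policy(state)                  # the agent picks an action for the current state
        state, reward, done = env.step(action)  # the environment answers with a new state and a reward
        total += reward
        if done:
            break
    return total


rng = random.Random(0)


def random_policy(state):
    """Ignore the state and move left or right at random."""
    return rng.choice([-1, 1])


def always_right(state):
    """Always move towards the rewarding end of the corridor."""
    return 1


print(run_episode(always_right))   # 1
print(run_episode(random_policy))  # -1, 0 or 1, depending on the random walk
```

<p>A learning algorithm would sit between these two extremes: it would start out close to <code>random_policy</code> and use the collected rewards to move towards something like <code>always_right</code>.</p>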
<p>It is a way of learning inspired by the achievements of <a href="https://en.wikipedia.org/wiki/Reinforcement" target="_blank" rel="noopener noreferrer">behavioral psychology</a>. Animals learn about the world this way (humans are more complicated). For example, imagine a puppy learning to walk. It tries out various actions by trial and error. Through its senses it receives stimuli from the world and chooses actions depending on them. If it doesn't fall, it will repeat the actions that worked more often and perform them better (positive reinforcement). If it falls to the floor, next time it will try to avoid the actions that led to the fall. And so on, until it learns to walk.</p>
<p>In the table below I have provided examples of various agents, environments, etc.:</p>
<table><thead><tr><th>Agent</th><th>Environment</th><th>State</th><th>Action</th><th>Policy wanted</th></tr></thead><tbody><tr><td>animal</td><td>world</td><td>state of the world, the position in which the animal is</td><td>limb movements</td><td>learning to walk</td></tr><tr><td>robot</td><td>factory</td><td>data from sensors, position of manipulators</td><td>movement of manipulators</td><td>sorting items</td></tr><tr><td>bot playing Go</td><td>board</td><td>current position on the board</td><td>placing a stone on the board</td><td>winning as many games as possible</td></tr></tbody></table>
<p>But enough of tables, let's see how it works in real projects. I've put together a playlist of some cool examples below:</p>
<iframe src="https://www.youtube.com/embed/SH3bADiB7uQ?list=PL5nBAYUyJTrM48dViibyi68urttMlUv7e" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen=""></iframe>
<p>I find this pancake-tossing robot particularly cute:</p>
<iframe src="https://www.youtube.com/embed/W_gxLKSsSIE" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen=""></iframe>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="how-does-reinforcement-learning-differ-from-other-methods">How does reinforcement learning differ from other methods?<a class="hash-link" aria-label="Direct link to How does reinforcement learning differ from other methods?" title="Direct link to How does reinforcement learning differ from other methods?" href="https://dloranc.github.io/2017/04/16/reinforcement-learning-what-is-it#how-does-reinforcement-learning-differ-from-other-methods">​</a></h2>
<p>For example, what is the difference between supervised learning and reinforcement learning? In supervised learning we are given data that is already labeled. Based on it, the algorithm must learn either to tell the examples apart (classification) or to predict a certain value (regression). In reinforcement learning we have nothing like that to start with. We need to find a solution by performing actions in the environment and observing the rewards and punishments. On top of that, there is the trade-off between exploring the environment and exploiting the knowledge already acquired through interaction with it. This trade-off does not occur in other machine learning paradigms. Unsupervised learning resembles supervised learning in that we are given some data and have to group it somehow. However, we are guided not by rewards or punishments but by the structure of the data.</p>]]></content>
        <category label="Sutton &amp; Barto" term="Sutton &amp; Barto"/>
        <category label="DSP 2017" term="DSP 2017"/>
        <category label="Reinforcement learning" term="Reinforcement learning"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Command line arguments in Python]]></title>
        <id>https://dloranc.github.io/2017/04/09/command-line-arguments-in-python</id>
        <link href="https://dloranc.github.io/2017/04/09/command-line-arguments-in-python"/>
        <updated>2017-04-09T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[A post about how to handle arguments passed to a Python script from the command line. For this I used a very interesting library called docopt.]]></summary>
        <content type="html"><![CDATA[<p>Recently I was wondering how to handle arguments passed to a Python script from the command line. You can do this very simply using <code>sys.argv</code> from the standard library:</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> sys</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> __name__ </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'__main__'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"\n"</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">join</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">sys</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">argv</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div>
<p>The above example prints the arguments one per line. The condition <code>if __name__ == '__main__':</code> checks whether the file is being run directly as a script. If the file is imported from another module, the code under this condition will not execute.</p>
<p><img decoding="async" loading="lazy" alt="sys.argv" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAooAAADaCAIAAACuB/QFAAAACXBIWXMAAA7EAAAOxAGVKw4bAAAJOUlEQVR42u3d2WGrOhQF0PSTVlxKGkkh7iOdvRdswiAkIQls6yZr/dwhTCbA5sgCvb0BAAAAAAAAwJkun1/Xj/fXrPv94/r19XnxSwCATUI+IZ7jQSyeX3oz0/8CD21I3aZ836ZOSudbztNwFo2zl61tta7KD7eYt2CuzZpqVwccL5ufd/qJZ/H8xL10/bhUbcpwLkwT306M2k8xrva9co66eG7ZtU2fZruIl7Wwwd/M5uGkfVLbtngWz086ru+H85FNaZu3Lj/HdZTP1BbPraHuJIVXpvPtMjace8Wn3tgQPt32T8EeKR1+Tup54kitPk60XWBQX6xrjORmlBcsYdEyXtXnBoXgh4Fp1y2nm6+E96XdZ5s3NJi04nNl9+F2E/d/n3ULDJYXW9f+As87oh4fKk3zvleX6z9H3APj+YSbb6UzvKAQG872qpP+5yo6nq3LC1J4Eof/zlTP8wJXM63nWDbR/cyVWHXmQrOYbD3XT+Ss7hq2m7FqKMzG8xzM1+t1vAFZbG/T50oFQHMSpWbMNPXm11W9JQeOqAfGc0siVc0zT1wXz7VfRd2XvpizOmfr2+yBUwroylM2PFfTV9PNVScTz0FmxkJsvYLMXK3NfsEXdHN/uWC7lx3pduJ5+Ou8pdMPw654i6Xsfa5cmjZdQdO/lOSdVX5dbfHcdkQ9LJ5vv72qpG1pw0kd5uU3NQXru29c4b3V2Tc5wAlVdOnFZXOyLq+g+YtO2XfP64RLtW2m56qrQIJ4ji+itXoO2r+nH0a7xW6L9pp4nn+LlSEdX2BmC/fW1RbPbUfUY4JlavhoO5UKZty2CjXEX+HKMve4D2tGAE4roYfTb2rrPnAxXWdReFJ3EM9BhbypnjPxHG1UbI/nxMY2x3NrSKfjeW93JtZ1bjznj6gHxHNxVXogzIKJjvTHLovn1VRVeat0hhcXz9HEKztfE3ny/UdhH+10IG2nT/+s7BIXPmBSFs+ZrnORdu+CeM7s6YPxXH9NPdqdfjPl4XiuOKJOz5agJfgxkRl/prh6xYVPS+3t3jPWATyyeG472yMNbOseVvsney6Q1jOkGj3L43m1ueELIbLxXJDP615duXjOtFTsfq6SX1N1F6VUh73C9ojIAZCYM/YSjiNH1KnxfDibG9OstT92zdc50x6tOTb0CYPXVc0N9+7hbNuZcid15KmmnUBalhqJ/kM1l7jFBoy9qku+AtzWO0HFvF1gNp7T+7Hgc8WeDAuWVndJzTy7FvvI++tKPrsW9Es/44gqLU93Z48/F7Z7TLV0pm6N5+R+rZmxfD8qneH1KX1ub9gnvSL0ufcxRU830bx//9QRBVB2i1z/GsLMxfT3dfTcxLM2v6fGs67DAEcupj+tfL/xShq2lv4rnzHdCamj6v9PHlEAAAAAAAAAAAAAAADQ7vTXg/S/QAAQz+IZAMQzAGResj++63N+7VQ4IG/kNVrZwY+/p5iGuR/Xmhj4Yj0owzz5VzBeVfZFWKtNLBzKqniBwfJi69pfYHpD0h/54MjKAPwTLp/BoPDzP9eD+C2DYBUKq0FtsvE8B/P1ev28LKfdjLocZtxiuqKhKZpzKzVjcgv31tU2+PGcyuFuWn5876IG+CNhncygxZhBQeAsBxPaiefhr3MNOP0wHI4oHEE5uIPYjE+YSNOm4IovMLOFe+tqi+fEwJnbGyilM8DvjeRo42v64t9aPacGP44O21A0FHQ6/Ka24cqQji8ws4V762qL5/CLhNj9j3QG+NXZPF/iN9VzJp6j36a2x3MiZ5rjuTWk0/G8l4SJdZ0bz+vdp2Eb4C+kc2E8DwlRlqaLf+biOZNgB+O5PiHTjd
tlS9hMeTieE7vg+w+lM8DvtPqac2zALYvngnxe9+rKxfM4abKNOhPPmzuM1F1IcaEZX2B6C3fXldvCcJ9vP3Kku/a6zx4AvzWgpzbZVV2cadDdfhUbVMzbBWbj+W37GFKsk3h8o2JPhgVLq2sETjxqltjCgnUln10L+qXn15O4qwKAeDh4+vb0/Zvdm2E/cgDYxrNi7qnx7GFnANIJ0d543M92V73N6+Xx/LP1shkAAAAAAAAAAAAAAADOML6nyotFAKCfZL5+XLz3CwA6MY1J6LWcANBnES2eAUA8AwDiGQDEMwAgngFAPAMA4hkA/oLL59fG/UloAAAAAAAAAAAAAAAAAAAAAAAAoBvLF4d5rScA9JHNUybfglpCA0BXDIwBAOIZANgztG4brwoAOgtn6QwAnWWzcAaAbty+c5bNANBb4axHGADIZgAg6t6qHRLWAAAAAAAAAAAAAAAAAAAAAAAA0JP7Sz1HBsUAgM7c3vApoQGgv1LaC7cBoLvyWTwDQGe1s7ZtAOgmlnUNA4A+6RsGAJ0W0vIZAMQzAJAPZz23AaCLRJ6IZgAAAAAAAAAAAAAAAAAAAAAAAOjZ+P4wLw4DgF7cxpIUzwDQWTh/XoYCWjwDQBd+hpEUzwDQWTi/iWcA6MPYrj0FtXgGgI7CWTwDQA/mdm3xDAD9pHOckAaAjuJaMAOAeAYAAAAAAAAAAAAAAAAAAAAAAAB2bN677bVhANBDPEtkABDPAIB4BoB/Lp598QwAvXr/uH4n9PXj3a4AgM4CWj4DQFeGpm7xDAC9pbPvnwFAOAMAa/fuYPptAwAAAAAAAAAAAAAAAAAAAAAAQPcWoz57dRgAdBLNQhkA+gpn2QwAnaWz8Z0BoMPaefHNs6wGgB7iedG4fRteUlM3AHRQPQf/oYIGgFfH8yqNxTMAvFrYmq11GwB6q5/VzgDQUQWt4zYAAAAAAAAAAAAAAAAAAAAAAAB0ajHS84KXbgNAb3ntzZ4A0BEDVgGA0hkA2C+dpTMA9JbOGrYBoCMatgFA6QwA7JbO0hkAeiudNWwDgNIZAAAAAAAAAAAAAAAAAAAAAOje5dObwwCgL8ObPcUzAPRTNi95vScA9JDNQyJr2waAjtL5lspD27a6GQB6cBtLcsjlIafFMwB0VEDfaNsGgA6raCENAH2V0EMuT23dAEAPxfMtlG/5LJ4BoJ/ieX7MCgB4ZdX8VftSkv/S7FEAODOly2tm8QwAT1D3xjDxDADdEc8AIJ4BAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIDR/2Q/G3bq1U3gAAAAAElFTkSuQmCC" title="sys.argv" width="650" height="218" class="img_r5VP"></p>
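<p>As a rough illustration of using <code>sys.argv</code> for something slightly bigger, here is a minimal sketch of a hand-rolled parser. The script name <code>repeat.py</code>, the <code>run</code> function and the <code>USAGE</code> text are made up for this example.</p>

```python
import sys

USAGE = "usage: repeat.py WORD COUNT"


def run(args):
    """Return the text the script should print for the given argument list."""
    if not args or args[0] in ("-h", "--help"):
        return USAGE
    if len(args) != 2:
        return "error: expected exactly two arguments\n" + USAGE
    word, count = args
    try:
        count = int(count)
    except ValueError:
        # The help text and the validation live in separate places,
        # so nothing keeps them in sync automatically.
        return "error: COUNT must be an integer\n" + USAGE
    return " ".join([word] * count)


if __name__ == "__main__":
    # sys.argv[0] is the script name, so the real arguments start at index 1.
    print(run(sys.argv[1:]))
```

<p>Called as <code>python repeat.py hello 3</code>, this sketch prints <code>hello hello hello</code>. Every check and every line of help text had to be written by hand.</p>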
<h2 class="anchor anchorWithStickyNavbar_iC02" id="docopt---we-put-docstrings-to-work">docopt - we put docstrings to work<a class="hash-link" aria-label="Direct link to docopt - we put docstrings to work" title="Direct link to docopt - we put docstrings to work" href="https://dloranc.github.io/2017/04/09/command-line-arguments-in-python#docopt---we-put-docstrings-to-work">​</a></h2>
<p><code>sys.argv</code> doesn't offer much, though. You have to handle and validate the arguments yourself, and if you want the script to have help (shown when calling it with <code>--help</code>), you must remember to keep that help text consistent with the rest of the code. For simple things, <code>sys.argv</code> is enough. But if we want to build a tool that will be used intensively in the place that programmers like best (i.e. in the console), we need something more. A quite popular solution is <code>argparse</code>. I went with <code>docopt</code> instead, which I find quite interesting because the command-line interface is defined by a docstring. Cool thing.</p>
<p>We start by installing <code>docopt</code>:</p>
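<p>For comparison, this is roughly what the <code>argparse</code> route looks like. It is only a minimal sketch: the <code>repeat.py</code> interface and the function names are hypothetical, and an explicit list is passed to <code>parse_args</code> instead of reading the real command line.</p>

```python
import argparse


def build_parser():
    """Describe the interface in code; argparse derives --help from it."""
    parser = argparse.ArgumentParser(
        prog="repeat.py", description="Repeat a word a number of times."
    )
    parser.add_argument("word", help="the word to repeat")
    parser.add_argument("count", type=int, help="how many times to repeat it")
    return parser


def repeat_from_args(argv):
    """Parse an explicit argument list and build the repeated text."""
    args = build_parser().parse_args(argv)
    return " ".join([args.word] * args.count)


# An explicit list stands in for sys.argv[1:] so the sketch runs without a shell.
print(repeat_from_args(["hello", "3"]))  # hello hello hello
```

<p>Here the help text is generated from the parser definition, whereas <code>docopt</code> goes the other way round and builds the parser from the help text.</p>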
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">pip </span><span class="token function" style="color:#d73a49">install</span><span class="token plain"> docopt</span><br></span></code></pre></div></div>
<p>Let's write something simple, say a script that adds numbers. For now we won't focus on the adding itself; let's first see what we are dealing with.</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">"""Sum integer values.</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="display:inline-block;color:#e3116c"></span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">Usage:</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">  sum.py &lt;numbers&gt;...</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">  sum.py (-h | --help)</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">  sum.py --version</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="display:inline-block;color:#e3116c"></span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">Options:</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">  -h --help     Show this screen.</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">  --version     Show version.</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" 
style="display:inline-block;color:#e3116c"></span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">"""</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">from</span><span class="token plain"> docopt </span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> docopt</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> __name__ </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'__main__'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    arguments </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> docopt</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">__doc__</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> version</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'Sum 1.0'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain">    numbers </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> arguments</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">'&lt;numbers&gt;'</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">try</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        numbers </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">list</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">map</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">int</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> numbers</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">sum</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">numbers</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">except</span><span class="token plain"> ValueError</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'Cannot cast value(s) to integer.'</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>From the content of the docstring, it is easy to guess how to call the script and which arguments it accepts. Angle brackets (<code>&lt;numbers&gt;</code>) mark positional arguments, one or two leading dashes mark options, and three dots mean that an argument may be repeated. Parentheses combined with a vertical bar, as in <code>(-h | --help)</code>, group mutually exclusive alternatives.</p>
<p>Time for testing. Let's call the script with <code>--version</code> and <code>--help</code>:</p>
<p><img decoding="async" loading="lazy" alt="python sum.py --version" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAooAAAA8CAIAAABuGgyaAAAACXBIWXMAAA7EAAAOxAGVKw4bAAAEQklEQVR42u3d65miMBgG0O3HVqYUG7EQ+7C09cJArpiMiojn/NmdUS6GefLyBST//gEAAAAAAADAM/0cTsf97j3b3u2Pp9Phx0H4BteD7XADtHeaC8RzOYjF89fF89tOBQE+qGwOvTYlxTMAtGXzJRgXGtsWzwDQks7XVL4EZHM+DgPhw0XEYKTy+ps45X+Td3pzoVYf3pSvMFhHXt1Xd6NzxGBcY3qWcHnX7cfbycttqWmTdxvsLw31nuMf78b0q7Chwp0bzuaml6M9LzdvfW0vOsoAH+vW6Z37wimJmpcaO8owWNL+Pv15pnqeVhgtFC8xVvvBUpVNd1bw9+J5Cubj8TicStxrsQcaaunRk1ITRH8U8Tt/gzY6vWo9zyi84/lHGWADBXRnVZLWfvXYyVJ/Jp6DFQaLZWuYNjCzVGfJ2BbPl/9O22zZ2gMNtegJWnH8IL1ZMEvuvMRtDND8j+AFRxlgO1V0a0hn3WvYI6cRNFclteVibRC4vlTrp00+bMNuTG9pjue/NdRjRzH7dPWXqomc3DIYDzrP7XCtedua5WlHGWAbJfSlexzHuh9InTjB0h56FfFcS5Fl43m+oRYePxkbtLGgv//pZ0JaPAO0J1WxL2xLnUoPev6n8R7tehc8Nw76lI47LR6TEdvnxnNHQ70hn5PbA2f/FlrburiS7JcvP8oAH108ZzcKNfW5hbuQ4xuHsn79bn8ddsH5LUml0eE/d9y129DiO5Ia4nn4zDNX2vsaavF83qc378+NpTx0nb+Q2S8+ygCfVzWf+h9Kki6WL1RIovLSTRVSeBW0ctW0o+OeuUKbXISfysn78ZzcYPyMhlr8D6EyDF36APW2br8Anrz+3KMMsJGU7unx2r464+svGgqAB/TemXQvdXw3VUMBsJ7U+R2fFDkaCgAAAAAAAAAAAAAAANYtfJbTAl/0GTbX9eCn4IFSvooEwPZdg2+KvHN2vvCBicMTLH+a5t2IdzGctEJCA7D9dF4q7cZnk+164rn0yGX5DMD2i+eZqQ1qkx+fA/I24DxMHNE1Wt0Tz6W9kM8AbNzvhefKZEW1eJ6C+Xg8Hn666uGueM6nFtz3DY0DwGdHdOtEfmPJPV0J7prl70/xPE7ttBPPAHxfSse1ai2e5yc/fnY87w/1/QKArQuuRa8jnvML2+4NA+ArK+hiPAc/LhnP2ZvHQW4A2G4ahzEZf6s4jerTy+N5ePpI8lL2vWcj2wBsXfBAriwZp3vGhtu0H4vneFP5Q8Dik4DKkrIZAAAAAAAAAAAAAAAAAAAAAJZzm8VZOwDAeniWNQCsq2yuP9cTAHhPNl8S2dg2AKwoncfZLdTNALAGt0mizrncNSMkAPD6Ajqb2xEAWEkVLaQBYF0l9CWXx7FuAGANxfM1lK/5LJ4BYD3F8/Q1KwDgnVXzyUNJAGClKS2XAWBVPDEMAAAAAIBP8R8jccMLTupgKwAAAABJRU5ErkJggg==" title="python sum.py --version, result: Sum 1.0" width="650" height="60" class="img_r5VP"></p>
<p><img decoding="async" loading="lazy" alt="python sum.py --help" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAooAAADyCAIAAABF+HNeAAAACXBIWXMAAA7EAAAOxAGVKw4bAAAP5ElEQVR42u3d7XmyOgAG4LPPu0pH6SIdhD0c7agFyTcBlaLe959Wka/gxWNCIP/9BwAAAAAAAACP9PVzGr7//c26/30Pp9PPl4PwMiV//rbcOHAAzz1P7xDP5TgQz68Wzw4cwA7V5tBzT7biWTwD0JfNl/PrTm3b4lk8A9CTztdUvpxnu0+zY0P49dx8NQX79Z045acT+PzhQl19/FC+wGAZee2+uhkrWwxuS0zD5vKp35e/P15+55pXuVhgWwrqT+K5WoYL15jzjb7joAAQB9/5/DonUfdctzNveI6+5X2a/8u153mB0UzxHLfafjBXZdXrq3zNeJ6DeRiGMdCWSuyOgtr1C1Au+fhLEZb8QjxXdhmAlRXoldWctO5Xj50s9RvxHCwwmC1bwryCxlx9TQYr4/ny77zOnrXdUVB7xnO5DNPOgqVDV4nnyi4DsLES1RfS2Sk3jJo0guJTc9+15zgXa43A9bl69zbZ2Y7NmD/SHc/bCuq+o5jtXW1SowyTLoPFBu5KPNd2GYDVVejLKfTW1n1H6sQJlp6YDxHPtZDeN57bBbXn77J6PK/eQfEM8OCTdDHx+k7KlXP8+U9nH+16SOSfr0/bVAUNF5IsMHj5oHheUVB/H88934XleNa4DXB35bnc/WfppFzohTy1ixYW1dXDKIy+eIZa6/DmeK51Q4v7nXXE87jPjaxaV1B/Hs89bSlL8VzYZQD6Ts7rH0qSzpbP1DotF26TWgja8Cpoo9dRZzw3rtAmF+Hnu82W4znpRv6IgvrreG7sQOk2uaC/3skTPwEedI5ecxpdbrDc6RGhL/Hz57MKSmM2wMOs7Zm0dArWG+hzC0o8AxzvFDy1Qsvmjy0o8QwAAAAAAAAAAAAAAMCLCJ/z5I4oAPh71xtw51A+R7V7VgHg79NZjRkAjld5LteXm4MfnyP998FX48ARxj4AgMeZLjznVehmPM/BPAzDz5cnOQLAkyK6d7jGW5V7Hg1x01DLAEBnSk8J3Y7n9uDHAMDjBNeixTMAHKgGXYzn4KV4BoDnpnEYqvO15FJUn1bE8ziescQGgC3GIB3FeTr3GRu7affGcxznAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADwV/59D6fTz1fwztfP6TR8/1M0ACCeAQDxDAAvG8+XlzfRB6+zFqfEM13NSwynxXPdlunXAQDiuR7P+dTgY0mG315GM13DeF7C5eXtVTJNPANATzx3N3QHsZss8fIyTPtwcY30BwDxXInnuQG7ENJpC/a0lHrtOW/zPolnAFgbz9WQbjRaR9eko4VHTdsAQE2axvUG53BKetk4adyuZLDGbADoz+e4u1blanPWZyy8Qn2qXXvO87kV0LqGAUCY0Kf8InPSTB2lZjDt/H5UZc4vMQeBnCxTz20AeL6oXr1YnQYA/iKe87wGAHaXNm7LZgAAAAAAAAAAAAAAnsdN3wAgngEA8QwAz/DUYS73jefL2jwEBkCuFYa2SANpTr/zf+fs+J1rHEbj1JFdY+bMY2kUh83alobpEBzVdW3dr3Gu0gKzMgy3+3eZweRsTO7yoCKLY4IB8NaqQdiMsTnAhmEYU2spS6ZoG4MtHXY6zLvGcJjL0Ryuq7D8bfuVbHw+emc8zlc26lcU8p0/QX7nFdEAn1p1LibhQoxd/p2rvT0Ny2kluZ7PK5qpKxGWritY4rb9amx82hIdrSAO6+izPb9BDMwJ8NHV50JzbTvGrh+eP9Idz0nDb7FO25vOjdplfeM37ldj4/NhsU9xPNf2pVbyKtEAVK
Ni33iOl9VfXWzVnp8ez/Ns1X1fLphmSKs9AxCHUBJIwcsHxXMlJ89/1tUUq9eeu+K5c79KG1+4iL4hniu/XVSbAViozyY9rTrieWzwzTsjxx3D40yNu1FtqPz3dM/euF+FjY9fVbZ6TUN9dgG7GfqVydsmAXDA+vKs3P/51p25N56TjtPFVRViohDZK39b1HpFx9u3Zb+SrU82vrpv9XhulfzCfc/F4r1nEgAf/kugmQ2exgEAB4vnVTc7AwBPjefpmrNsBgAAAAAAAOAT6FwOAOIZAF7ZPY/PPFY8b93C2pjTDyo4ANiSaH2x+b7xvHX29gNEAWB7xbk7XsTz41cJAHdGi3i+/ycOAHtVPrOLls1xkc8J9TvXOKZEz6l9TLZoPIp5Te0BIRe2fk1gjovON+No8VzdwoVrzHnZVUteBRrg5eqezXieg3kYhjFJusZ7nrMhnCeN2BWRu/pJ3clmrJx/t3iubWHUkatY5S3Hc6Xk5TPAcavOxbxZiOfLv3O1t6f3b1pJrufzir7E6zMl3Yx1HZd3i+fyFqarL+1+JZ4rJb/zvgGwsq6Wnpnb8dwe77kvScNMTrN6TUevcqTUxlNuDwX98HhuDOu8YQuj6xCtsafzeK6VvOozwEuF9L7xHC+rPwA31p73i+etx6Iez6vLeU08qz0DHDWkg3bU4JwevHxQPFdS6PxnTdxuu/b8qvHc82tkOZ7LS1F5BjiqSn32VrfujuexEbYREoXu2lPL7bqI2Nhz+zXjuecRIkvxXHmEi3QGOFp9uXxpNJw4dtPujec4zsurqvUdXh1+K+/Yfe14bpRj+v58PJdLXjoDfPYvgcV635bs2/GRGi95fbav5IUzgHiuhOzW5NtW7xbP95c8AO8az9M15/sSYp+Bl94wno1YBQAAAAAAAAAAAAC8g94RsQGAnePZncUAAAAAAI80PbUrHi6hOd7z8P3vd65xrIyeK6vjA7eicTbmNRUG43jKtdr8AZbJ6sPSSAbhunxonhxtXrkM60ub15xPrRYUAB+jGoTNeJ6DeRiGn6+uNJ3iZgybcJ40M5/5EOh0/Ixo26MHXMafjIe7jObaNORE/Fa4rmS8L0/EBvjQqnPx5L8Qz5d/53pnz3Ob00pyPZ+f+xTowujHQS6GZZEld17F7QzQPJ6zfZwXkhaUh2IDfGr1udCE2o7n9njPfREVplqa1Z1x1Biuuj6pmshRC3Xa6NzaqloZ9u17+pH7RqQG4K1Det94jpf17Lbc2+bG9eXmXizvYiOkxTMAd4Z0eAU0bsx9bDxXUuj85/lZNK7rsgmNLVoZz42FZG+2WrvFMwBZBJWqz3FnpY54HpuJo1ApXPON649x56sd8vk7TudpP8vr7w3J4oXoQmbnHc9KDf3iGeBz68vli7fhxLGbdm88J32Pi6sqJE4hsp+845Vm6NJW1kOy/wJ4Mj281l25Ni6eAdghEBdvP3ILEQAcKJ7d3wsAx4nnqY1XNgMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAb/PseTmc/X4oCADb5+jlNhu9/qxK4lr+/8bxiaQBAnKO3GL0GdWeVtxnPAMBdFeeojpu9IZ4B4I/T+feta+he4vc8bbyKHDRVz+9EpqQOmsrz9I7mnadW1/VfvsxsoRrSAXgrpQrwmJRzko65l312ofZcmBy/FTakT+saX8a/Gpaq6eIZgLeL5zTW4ngOpqYpuTqeb/XyvO6eriv6aH97OwB8RO05nJjE5Np4zlN2/kj64TjJb23eQhqAT5BVaJNrz8eIZyENwIdXn+dgXGrMXhvPa9ZViueutQLAu9Sf4z7Z0YtbEKZXh/9bukm6kKPxDGFtuj+eCxeidQ0D4F2r0PlNS+n9U4W4LN0mVbrtao7O8A6pRkN5GM/JAvMYFs8AfFhqa0YGAPEMAIhnAAAAAAAAAAAAAICXs2d/763rao5IreSfvRnjc2bcFA
Agng+cZ9u2e/nJa39U/L3x7ClvAOL5feI5ekb6dR+ih5++QjwDIJ7fLJ4LI4K8XO0ZgD+L57kB9omNmEvrWrjGnAfJ5Z3zMnbZ+Dsqz82hNmsbXxq+pDwy2FgKC+Wei8YGLW9G+4iEU/sTvjlXMDHYjvO7l1fzxGyc08ryypOO/7UBuJ26xzNUo7b35HVFyVNMtXI8Bws8YkUw2cTO0oj3ZC6N4uGpj+bZXXte+AKU5txW2K25susA319JyhYGG298baqTpl0eXz71Ow9wT3zEFZWnJVxjXWkNsC8Q0gUetKE2qL2mG1sujewgTAkyF1Ppv+3xvPAFKM25LdTqcy0Gd1b7bX9tFibt9Z0H6I+IKCZaIzdvWuC2dUUtnpWWyko8h+8cuiI0Fsu0ffXSyPfi9tnpQ0Fv6kfE88IXoDjn7SivKvDqXK0vXnVa42vTmLT5Ow+wc2bsdKpaiOel1b56PCf1wE3xPGXx18/w/fU9TFdk+39QPS6eHxvSzYaPZjw3atyVSeIZEM+96+ppll6O5+P3Qk6bWIulke9GFN3nSLuk87Wp+/p/ZzQ+JZ7vKfjsIvLqqF363dA3STwD4rm6rt/61PLjMRrxnF5SPEbphhscbWGrNPI+TuFcwzAEl16HofeAdXW42xTPj7gQnX4Dkq5hrcp7PbqLk5Z3eWwZl9mAeP4vv/8nTu7YHE+n07Gf+BlfA21U9ZPSCOfL7nWKIn5FMBbu1lr+zVQq+VbXg971Vzqhl2+sqh/bxpegMmnxO5907QbgvrwHAMQzACCeAQAAAAAAAAAAAADuNj4P4iM6Y795t/NPOpQAnxHPbzb4bvjsquOPfulQAvD+omdwRg/hdtM2APxdOldqkOIZgIckSzIkVDh+QTLo8jQOcWGkimiIiMoQEOVRD/Op4+jH89TDtaU2hkkc47m28cVdLo4NMY0B7VACfEw+V4dYioIiH/+wMiLSco2x8In4rXBdychC24Y1fK5pE/PtSqZEG1/b5eIedgwy5lACvJHCcMrByTQ8e2an+7xe1HnWzc/pWfrMC0lHeH7maJgPiOhC17Dyxld3eS7L0n8OJcBrRm15pN36pOppPB7EuKcFNl1X7cyendMrDbPXj+w5WPXDDkE0VnNx4xu7PH0o6BfdFc8OJcBbuZ0l4xBonjyXz6yNM/ubn9OD6uimeJ4Ow9fP8P31PUwXhnt22qEEeL98vpw9a9cu15/TGwvJ3mw1kb7eOT3Y4vrGt3b5t7PWJZ2vTd3X/zuv0jqUAO+Wz9/xKX2qNJXPn2uqc1mwFE70eW+l5fbh46Rxlk89G9/e5WEYgsvOw9C7z699KMdWeJkNEJ6+6z2Ps5tk6uf0/gvgyfTwAmnjwVtHrHLF13Yb1dZk4yu7HD/nZOWjuV75UCZduwEAAAAAAAAAAAAAAAAAAACo6H6YIwCwk76BigCAvarNledEAgB/ls3jyL8qzwBwlHS+pvK/70G9GQAO4Tb6n/FwAeBgFehsED8A4CC1aCENAMeqQl9y+dbWDQAcofJ8DeVrPotnADhO5Xm+zQoA+Mta88lDSQDgoCktlwHgUDwxDAAAAAAAAAAA4F38Dwn9K2cf8ovIAAAAAElFTkSuQmCC" title="python sum.py --help" width="650" height="242" class="img_r5VP"></p>
<p>Let's check what happens when we call it with the arguments <code>1 2 3 4</code>:</p>
<p><img decoding="async" loading="lazy" alt="python sum.py 1 2 3 4" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAooAAABkCAIAAACOx4YAAAAACXBIWXMAAA7EAAAOxAGVKw4bAAAJOElEQVR42u3d62GbOgAG0O6TVTJKF+kg2aOjtU6IQUJPLB6Oz/lzm2BAlnP1WQKkX78AAAAAAAAAYKT3P38/fr+dc+633x9///559yEAwCohD4jndBCL57EV3FeZ/7+Y3bXut9xnw9/NtLuPHKCxpd27yRTPO9ftx+/3rsq8ffr3F3/+KfR+DtNp3zr3EM8AtWy+tZIHjW2L5x0/ya8P8JHK3LZvEPGt5+jbCeD10vmzUb+1mc2N5TQQfu8E3YM90ZH6bvDnFyf66tOL1geMelthjytbjM4Rg/sR43CaE+Qr+772mk9ZrbAtFbXTN6Ad933r7q7f3rl4Bqi0rLdWsqux/I6bKV+WzfM97+P8r/ee5wMGO4V7LIdfv/fKnLo/TorxPAfzx8fH9FWiVmMPVNQ58bylFF37zC8WzwBtHcn2Zjnu++VjZ9UGF+J5ccDFbqsjzCco7LUtTyrxfPvnfM6Wsz1QUWfE8+e77Erazr+c+P2LZ4C2bl5bU7sKgGXUlJvgtmvPYS7mBoHze7W+2+jNNhRjfklzPG+rqMPj+T5AsO2EDTuux0HEM0C1bb61r/ex7gdSJ0ywuNW+RDznQvrYeC5X1LHx/FUXm8vQNL4dvUg8AzS1zsnEawuATKz9/0/jPdr5XFy/Pr9tU4O/PEh0wMWPg+K5o6KOjOevjvMDRWiJ5+j5vcMe5QN4/s5z63OvYQAkxja/m+LEoZKnKAVtuENudHhzPOduQwvvO2uI59RMG49U1FHx/HA2b3taWu8ZoNKS9/dk4t3WO5UuRyYek6oE7bLjlbnRqqPBj0qfe4Zruk27NZ6j28hHVNSWlIxUj5x+3q1ai+G5Nt+HJp4Bam10T0tZ758dNEXoU3z9UVEAbO139U/KWEidcc/w/vB4VlEAHJA632OeIkdFAQAAAAAAAAAAAAApuQmsHp9VarJ1JaXlFBi9uw8r/KO1avJKAIZGZ98qgzvE8yO7Dyt8alKu3uU3xDMA3dGXC7Fh006eEc+D58zcXHrxDMAO8TwiWU6L5zGxKJ4BuFA8Dz3J/4gKFp2I868wbrxOuKnU2QPWS9OzRzGeE8t8tMRzMGAebnbRGoDj4nmOxHC+6SD8mhadXB2wr4c6Mp7f/yyOk5xHO1W2yvX+cm0AIJ0HxnOUY1PqxAVoS7P4gPuOIK9uDcvVWCrHU0XLrofRUhsA/PRg3nTvVGHR5NymwvLMqaWKG+N5+ZtdF4AqDm7Hb6AlnueKKozyG+AGePGQ3r39r8Rz7fzXjedo9Lm195wN6YduQgPgZ+XzUbeGpfKu5ftBPZ73H9xOHjy+NtwZz+sXGMwG4BLxPHUhi5FUi+fOR5zH3RoWnHgamO6O56DjX62Nr7NYLBpAPO8bz79Wl6yj5E7dlZXb4+B4DgoyPehVL3zp6n31vX1v1cUGEM+XLPWrJtTA2UoBkHTiedg713cGeJn+2FM1+i8Zz9O4to4zAAAAAAAAAAAAAPwsX3d0P+cN0tNtzgMLf3ptWAcagHsePOlzPJsX5DqiNtYLU3XErXm5AV5c51zWaqMnnjcHrHgGEEiCYI/aEM8AnG+9GHPUE81dVf3/+9uL5s1BKgVDxNGCG9lB42DBiXnrNPX4vLWnmzxyjY1sCWvxnKuNXy5aA5CPo+zyzEFWha8MZxsN9qr3IROvCH+1PFe0ENT6C8VR8fz+Z3GcZDFS77xQG4XqBeDFJdZnXuTiMoFWyb3u4jYG6DqyVqk4HyS+sPzQ8HPb15XCmpLFHE9FcbY2itULwE+L2nSy5DdlE3l9E/PfTMc6V4xctK2SKDPG/vmSyorUO8Rz9uhxjbTEc
7Y2itULAPdICjt01aQqZ0khpJ8ynqPxgtbec7Y29n0fAPyYfL5FSO4ydH881zqUy1+WRrsvEs/xteHOeE6PVMhnAGqh9DtM5+8eX8NtTZUjt9xCtb7x7Huvh+J53K1hwSXwaWC6O56D2ihV7+Isnm4HeF25GIsuWYcBWhnGrV4Aj7YvL8Zmro2fGM9B6acHvYJ71dPvq1AbpeoNtupiA8DFRjV0nwHgckMa+s4AcKVk1nEGAAAAAAAAAAAAgHN9TeFx2jM3F5uQcmxtLOcKWd86ve1cwz8vxdirGPlFvcNFVAGyLct5j91cMZ6H1Ua80mPruaZ2PT9j98jPK3PA5cxriYIcU4xgLa7U2Y6qjbA8q/oo75X7G+9bcBx4NfEiyK8dz2NroxzPyXNNv3zPVsvwzyt1wGBC0ngJj+OKUd9+aDHuveHVB1MuRvZvvPz3AYjnSj7uurDTsfFcbQ/HFqchnlfLeXy9/q0czyMrrH7A1CtOKMamlb3GFWPakvy/oVwM8QzslJ6LJmRqUOarauGm9IITX6FzXzUpGLqd9kodcO66JcZYpyCbNyev7a23bJ7esntJjIea30sNKlyjMCeX4nskesOXVfEM7BvN8+/ujU3bco1BMH98fExp/Lnx+4DTUdarQWbGWMOba4JzNy3GuK2Nfc14vsQ10nMLMZ9dPAPnt4YNF9kWjVUlnm//nPe+b4wPuDhK3HgFJ4guiC5f29KQbwnbbSH79PF88r2D80DINW6QGBnP8hkYEs3lDK7E82cjNL8kjOdo1HrRT1lZxnOulbyPlJdbvv0fKXv+eL4PfFxkMOeMkoSfxdB4Pu6bIvASvefd43neLduo1VvJYkjrPbcW4kLBcc74dnTWofHs0SpgVOvcHs+LH3vieZFoxYhqbSVTBzlsJpanjueT56u5Spglx3D65hRx7RnYP6IL8RxfNp5bsPZ4TvyUaQVb4zlu0n/wndsjB6IfyOa9xsOTj1+fUwy3hgGX6bTk7ooOG6vg8ajbD/V4Dmc+TD9em9qcbyWjXYLm74H28Jh4TvXVWo8Q3VX/8Jeyjb3FccWIaqPvgAOLIZ4Bft7Yw5HN78k3WSuGeAYQz+n+4jWmD1GMhnjuHK4HYGhEHPDU7nSis3tiipH9+NOz2uk6AwAAAAAAAAAAAPxA0wLJAMBlmAEBAK7Vbd4+NSIAsE823xLZ2DYAXCid74tR6DcDwBXcJx7esM4OALBnB9r8vgBw1V60kAaAa3Whb7l8oUX2AEDneboxLLswPABwTufZ+u8AcI1e81+TkgDARVNaLgPApZgxDAAAAACAZ/EPuOFg9B+qatgAAAAASUVORK5CYII=" title="python sum.py 1 2 3 4, result: dictionary with arguments and options" width="650" height="100" class="img_r5VP"></p>
<p>We see that the <code>arguments</code> variable is a dictionary with three keys: <code>--help</code>, <code>--version</code> and <code>&lt;numbers&gt;</code>.</p>
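<p>To make the shape of that dictionary concrete, here is a minimal sketch that does not call <code>docopt</code> at all. The <code>arguments</code> literal is a hand-written stand-in for what the screenshot above shows, and the names <code>flags</code>, <code>raw_numbers</code> and <code>all_strings</code> are mine, introduced only for illustration.</p>

```python
# Hand-written stand-in for the dictionary shown in the screenshot above
# (an assumption for illustration; in the real script docopt builds it).
arguments = {
    '--help': False,
    '--version': False,
    '<numbers>': ['1', '2', '3', '4'],
}

# Options that were not passed on the command line come back as False.
flags = {key: value for key, value in arguments.items() if key.startswith('--')}

# The repeated positional argument comes back as a list of strings, not integers.
raw_numbers = arguments['<numbers>']
all_strings = all(isinstance(value, str) for value in raw_numbers)

print(flags)
print(raw_numbers, all_strings)
```

<p>The point worth noticing is that the values under <code>&lt;numbers&gt;</code> are still strings, so they have to be converted before any arithmetic.</p>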
<p>Okay, let's modify the code a bit, this time let's add up these numbers.</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> __name__ </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'__main__'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    arguments </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> docopt</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">__doc__</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> version</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'Sum 1.0'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    numbers </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> arguments</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">'&lt;numbers&gt;'</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token 
plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># we need to cast it to an int because we are dealing with a list in the form ['1', '2', '3', '4']</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    numbers </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">list</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">map</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">int</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> numbers</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">sum</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">numbers</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 
0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>It will display:</p>
<p><img decoding="async" loading="lazy" alt="python sum.py 1 2 3 4" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAooAAABACAIAAAAS+sEgAAAACXBIWXMAAA7EAAAOxAGVKw4bAAADw0lEQVR42u3d7XmqMBgG4O7jKo7iIh2EPRztFEQIIUCCCunpff85HyWg0SuPb0ri1xcAAAAAAPxq1+97c7ucc+3Lrbnfv69eBACYJeQB8ZwOYvH83g4u68yfD2aD3HZhmx3vm765lxwgc6T99JApnj/ct83tWtSZ7as/HNy9FUpfh/6yl8IW4hlgK5vbUfKguW3x/MFX8vECvtKZ+9pOIj73GmWNAP5eOneDejtmZg+W/UT4UAQNwZ4opJ4D/nhwolbvD5qfMKq2phXX4sMonDEYzhiH05ggj+x7tBovudlhezrqQ5+APtj2Ulyut89cPANsjKztKFk0WD7jps+XcHge8j7O/+3qeTzhpNG0RTj9+my1cOnyOFmN5zGYm6bpP0ps9dgLHXVOPO95FEVtxoPFM0BeIZk/LMe133LszMbglXgOThg0m51hvMBKq315shHP7V/Ha+Zc7YWOOiOeu2dZlLSF75z4+YtngLwyL2+onQVAGDXrQ3De756nubg0CbzcKvfZRk8242GMh2TH876OOjyehwmCfRfMaDifBxHPAJtjczu+DnPdL6TONMHiUbuKeF4K6WPjeb2jjo3nR1/sfgxZ89vRQeIZIGt0TiZeXgAsxNrPH5n3aC/n4vz45Z/tGvDDk0QnDP75pngu6Kgj4/lROL/wEHLiOVq/d9hSPoDfXzznrnudBkBibvM5FCdOlbzEWtBOGyzNDu+O56Xb0Kb3nWXEc2qnjVc66qh4fjmb962WVj0DbIzk5ZVM3GzeaO3XkYllUhtBGxZeCzdaFQz40aNfWsPV36adG8/RbeTv6Kg9KRnZPHN6vdtmL06vtfs+NPEMsDVGl4yU2/XZQVuE/oqPPzoKgL11V/mmjCup8741vP95POsoAA5Ineecp8jRUQAAAAAAAAAAAAAAAFC9fu+o+RKgYHcoa4AA4NBkbm7XxArdcC/lN24+CQCsGvYLm2+gkdrRWj4DwNFFdBjPqXSWzwBwajzPv7nxVvw9wgDAZ+J5+E6li3gGgBri+TarocUzAJwWz/O1Vu4NA4CT43n2P8MkNwBwVjwn1j2b2QaAIwTbgqX2Bwt+LJsBAAAAAAAAAAAAAAAAAAAA4M+5ftuzEwDqYkttAKirbA7ZvBMAasjmNpHNbQNARencpXI7t61uBoAadN8T2eZym9PiGQAqKqDj75EEAGqpooU0ANRVQre5PMx1AwA1FM9dKHf5LJ4BoJ7ieVxmBQCcWTXfbUoCAJWmtFwGgKrYMQwAAAAAAAAAAAAAAAAAADjJPwOaTa7JtdnOAAAAAElFTkSuQmCC" title="python sum.py 1 2 3 4, sum: 10" width="650" height="64" class="img_r5VP"></p>
<p>As you can see, the <code>docopt</code> library is very easy to use. You don't have to dig through documentation, as you do with <code>argparse</code> and similar libraries.</p>
<p>More details about <code>docopt</code> can be found at <a href="http://docopt.org/" target="_blank" rel="noopener noreferrer">http://docopt.org/</a>.</p>]]></content>
        <category label="DSP 2017" term="DSP 2017"/>
        <category label="Python" term="Python"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Five pools]]></title>
        <id>https://dloranc.github.io/2017/04/09/five-pools</id>
        <link href="https://dloranc.github.io/2017/04/09/five-pools"/>
        <updated>2017-04-09T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Continuing the fun of writing a bot for StarCraft: Brood War. I created a simple bot in Java that can perform a 5-pool.]]></summary>
        <content type="html"><![CDATA[<p><em>This post is about the Starcraft bot I am developing using machine learning. The project is being developed as part of the "Daj Się Poznać 2017" competition.</em></p>
<hr>
<p>Since the last post about the project, I've been working on the bot for quite some time, trying to write something that executes a simple strategy. After installing BWAPI, the bot starts and plays against the standard bot included in the game. Since I had known the default bot and its weaknesses for years, I decided that my bot would perform a simple strategy called the 5-pool, with some modifications.</p>
<p>It looks like this:</p>
<ol>
<li>If there are 50 minerals, then create a drone.</li>
<li>Take the created drone and send it to a potential enemy base. The maps on which bots play in <a href="http://sscaitournament.com/" target="_blank" rel="noopener noreferrer">SSCAIT</a> have between two and four possible base locations, so the drone may need to visit one, two or three locations depending on the map.</li>
<li>If there are 200 minerals and five drones, create a spawning pool.</li>
<li>If the drone encounters an enemy base, let it attack the nearest building and then run back to its base to collect minerals.</li>
<li>If the spawning pool is already built, create zerglings (and if necessary, overlords if you can't create zerglings because the unit limit has been reached) and let them attack the enemy's base.</li>
</ol>
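<p>The build-order part of the list above (points 1, 3 and 5) can be sketched as a single decision function. This is a language-neutral illustration written in Python, not the bot's actual Java code: the function name <code>next_action</code> and its parameters are hypothetical, scouting is left out, and I assume that drone production stops once there are five drones. The 100-mineral price of an overlord and the 50-mineral price of a zergling pair are the standard game costs.</p>

```python
def next_action(minerals, drones, has_pool, supply_blocked):
    """Pick the next thing to produce for a 5-pool-style opening.

    Hypothetical sketch: names and rule ordering are assumptions,
    not taken from the bot's repository.
    """
    if has_pool:
        # Point 5: zerglings, or an overlord when the unit limit is reached.
        if supply_blocked:
            return 'overlord' if minerals >= 100 else 'wait'
        return 'zergling' if minerals >= 50 else 'wait'
    if drones >= 5:
        # Point 3: with five drones, save up for the spawning pool.
        return 'spawning_pool' if minerals >= 200 else 'wait'
    # Point 1: build drones while there are fewer than five (assumed cap).
    return 'drone' if minerals >= 50 else 'wait'


print(next_action(50, 4, False, False))
print(next_action(200, 5, False, False))
print(next_action(100, 5, True, True))
```

<p>Keeping the decision as a pure function of a few numbers makes the rule order easy to check without running the game.</p>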
<p>As you can see, the strategy is very simple. In point four, I exploit the exceptionally stupid behavior of the standard StarCraft bot. If its base is attacked, it takes all its units and tries to destroy the attacker. If we start running away with that unit, the bot will chase it all the way to our base. This behavior has a huge impact on the bot's economy, because it does not mine for quite a long time, which means it loses the game. This trick is not necessary, because you can easily beat the bot with a 5-pool without it, but I like to bully the artificial intelligence :)</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="the-code">The code<a class="hash-link" aria-label="Direct link to The code" title="Direct link to The code" href="https://dloranc.github.io/2017/04/09/five-pools#the-code">​</a></h2>
<p>I placed the code of this bot in <a href="https://github.com/dloranc/five-pool-bot" target="_blank" rel="noopener noreferrer">a separate repository</a>. I don't want to litter the competition project with it, because I won't be writing that one in Java.</p>
<p>This time I will describe the most important fragments of the code:</p>
<div class="language-Java language-java codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-java codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token annotation punctuation" style="color:#393A34">@Override</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">public</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">void</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">onStart</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">	</span><span class="token comment" style="color:#999988;font-style:italic">// ...</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    isScouting </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">false</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    isScoutingIdle </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span 
class="token boolean" style="color:#36acaa">false</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    isSpawningPool </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">false</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    scoutDrone </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">null</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    buildDrone </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">null</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    hatchery </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">null</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    playerStartLocation </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">null</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain">    possibleEnemyBaseLocations </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">null</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    baseToScout </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">null</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    enemyBase </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">null</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    enemyBuildings </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">new</span><span class="token plain"> </span><span class="token class-name">EnemyBuildings</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic">// ...</span><span class="token plain"></span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The fragment above prevents a whole class of bugs. The bot instance is not created from scratch for every game; the same object is reused, and only the <code>onStart</code> function is called before each game starts. If we do not reset the class fields there, state from the previous game leaks into the next one and the bot may behave incorrectly. For example, the drone selected for building could not build a <strong>spawning pool</strong> because that drone simply did not exist: <code>buildDrone</code> still held a reference to an object from the previous game.</p>
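<p>A minimal sketch of the stale-reference problem, independent of BWAPI. The names <code>FakeUnit</code> and <code>ReusedBot</code> are hypothetical and exist only for illustration; the point is that a single bot object survives several games, so anything not reset in <code>onStart</code> carries over.</p>

```java
// Hypothetical stand-in for a game unit; this is not a BWAPI class.
final class FakeUnit {
    final int gameId;

    FakeUnit(int gameId) {
        this.gameId = gameId;
    }
}

// Hypothetical bot: one instance is reused for many games.
final class ReusedBot {
    private int currentGame = 0;
    private FakeUnit buildDrone = null;

    // Called once before every game; per-game state is reset here.
    void onStart() {
        currentGame++;
        buildDrone = null; // without this line the old drone would be kept
    }

    // Picks a build drone only if none is remembered yet.
    void onFrame() {
        if (buildDrone == null) {
            buildDrone = new FakeUnit(currentGame);
        }
    }

    // True when the remembered drone belongs to the game being played.
    boolean buildDroneIsFromCurrentGame() {
        if (buildDrone == null) {
            return false;
        }
        return buildDrone.gameId == currentGame;
    }
}
```

<p>If the reset in <code>onStart</code> is removed, the second game keeps the drone picked in the first one and <code>buildDroneIsFromCurrentGame()</code> returns <code>false</code>, which is the same kind of failure as described above.</p>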
<p>The most important thing, however, is the <code>onFrame</code> function called every frame:</p>
<div class="language-Java language-java codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-java codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token annotation punctuation" style="color:#393A34">@Override</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">public</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">void</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">onFrame</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">float</span><span class="token plain"> supplyUsed </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">supplyUsed</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">/</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">2</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span 
class="token plain">    </span><span class="token keyword" style="color:#00009f">float</span><span class="token plain"> supplyTotal </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">supplyTotal</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">/</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">2</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">int</span><span class="token plain"> dronesCount </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">getDronesCount</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    enemyBuildings</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">update</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">game</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span 
class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic">// ...</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">Unit</span><span class="token plain"> myUnit </span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getUnits</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token 
class-name">UnitType</span><span class="token class-name punctuation" style="color:#393A34">.</span><span class="token class-name">Zerg_Hatchery</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">supplyTotal </span><span class="token operator" style="color:#393A34">-</span><span class="token plain"> supplyUsed </span><span class="token operator" style="color:#393A34">&lt;=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">minerals</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&gt;=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">100</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span 
class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">train</span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">UnitType</span><span class="token class-name punctuation" style="color:#393A34">.</span><span class="token class-name">Zerg_Overlord</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">else</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">supplyUsed </span><span class="token operator" style="color:#393A34">&lt;</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">5</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                 
   </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">minerals</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&gt;=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">50</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">train</span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">UnitType</span><span class="token class-name punctuation" style="color:#393A34">.</span><span class="token class-name">Zerg_Drone</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">else</span><span class="token plain"> </span><span class="token 
punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">minerals</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&gt;=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">50</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">train</span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">UnitType</span><span class="token class-name punctuation" style="color:#393A34">.</span><span class="token class-name">Zerg_Zergling</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token punctuation" 
style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isWorker</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;&amp;</span><span class="token plain"> myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isIdle</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span 
class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token function" style="color:#d73a49">gatherMinerals</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">myUnit</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token class-name">UnitType</span><span class="token class-name punctuation" style="color:#393A34">.</span><span class="token class-name">Zerg_Zergling</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;&amp;</span><span class="token plain"> myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isIdle</span><span class="token punctuation" 
style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token function" style="color:#d73a49">attack</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">myUnit</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token function" style="color:#d73a49">scouting</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">dronesCount</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" 
style="color:#393A34">(</span><span class="token plain">dronesCount </span><span class="token operator" style="color:#393A34">&gt;=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">5</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;&amp;</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">!</span><span class="token plain">isSpawningPool </span><span class="token operator" style="color:#393A34">&amp;&amp;</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">minerals</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&gt;=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">200</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token function" style="color:#d73a49">buildSpawningPool</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></span></code></pre><div 
class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>First, three auxiliary variables are initialized: <code>supplyUsed</code>, <code>supplyTotal</code> and <code>dronesCount</code>. The first two are not strictly necessary. I created them because BWAPI reports all supply values multiplied by two, since one zergling takes 0.5 supply. I'm used to the in-game values, so these variables were easier for me to work with. <code>dronesCount</code>, in turn, is the number of drones; it is used to decide when to build a <strong>spawning pool</strong>.</p>
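<p>The conversion between the doubled values and the in-game ones can be sketched as below. <code>SupplyUnits</code> and its methods are hypothetical helpers, not part of BWAPI. One detail worth knowing: if <code>self.supplyUsed()</code> returns an <code>int</code> (an assumption here), then dividing by the integer <code>2</code> drops the half before the result is stored in a <code>float</code>, while dividing by <code>2.0f</code> keeps it.</p>

```java
// Hypothetical helper; raw values count half-supply steps (1 raw = 0.5 in game).
final class SupplyUnits {
    // Floating-point division keeps the half supply of a single zergling.
    static float toGameSupply(int rawSupply) {
        return rawSupply / 2.0f;
    }

    // Integer division truncates first, then the result is widened to float.
    static float toGameSupplyTruncated(int rawSupply) {
        return rawSupply / 2;
    }
}
```

<p>For a raw value of 9, for example four drones and one zergling, <code>toGameSupply(9)</code> gives 4.5 and <code>toGameSupplyTruncated(9)</code> gives 4.0. For the simple thresholds used in this bot the difference rarely matters, but it can shift a comparison by one unit.</p>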
<p>The <code>onFrame</code> function as a whole updates the set (<code>HashSet</code>) of known enemy buildings, handles the build order and unit production, and issues orders to units. It's all terribly sloppy, but believe me, it used to look worse, so be glad you don't have to read the original mess I created :)</p>
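<p>One possible way to tidy up the production part is to pull the nested conditions into a pure function that only decides what to train. This is a sketch, not the code the bot actually uses: <code>Production</code> and <code>ProductionPlanner</code> are hypothetical names, and the thresholds are simply copied from the <code>onFrame</code> listing above.</p>

```java
// Hypothetical result type for the decision.
enum Production { OVERLORD, DRONE, ZERGLING, NOTHING }

// Hypothetical planner; mirrors the hatchery branch of onFrame.
final class ProductionPlanner {
    static Production next(float supplyUsed, float supplyTotal, int minerals) {
        if (supplyTotal - supplyUsed <= 1) {
            // Almost supply-blocked: only an overlord helps, and it costs 100.
            return minerals >= 100 ? Production.OVERLORD : Production.NOTHING;
        }
        if (minerals < 50) {
            // Neither a drone nor a pair of zerglings is affordable.
            return Production.NOTHING;
        }
        // Drones until 5 supply is used, zerglings afterwards.
        return supplyUsed < 5 ? Production.DRONE : Production.ZERGLING;
    }
}
```

<p>The function does not know whether the order can actually be carried out (zerglings still need a spawning pool, for instance), so the caller would translate the result into a <code>train</code> call just as before. The benefit is that the decision can be tested without running the game.</p>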
<p>What else should I describe here so that the post doesn't get too long? Maybe what the attack looks like:</p>
<div class="language-Java language-java codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-java codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">private</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">void</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">attack</span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">Unit</span><span class="token plain"> myUnit</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token class-name">HashSet</span><span class="token generics punctuation" style="color:#393A34">&lt;</span><span class="token generics class-name">Position</span><span class="token generics punctuation" style="color:#393A34">&gt;</span><span class="token plain"> enemyBuildingPositions </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> enemyBuildings</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getBuildings</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token operator" style="color:#393A34">!</span><span class="token plain">enemyBuildingPositions</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isEmpty</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token class-name">Position</span><span class="token plain"> enemyBuildingPosition </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> enemyBuildingPositions</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">iterator</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">next</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">attack</span><span class="token punctuation" style="color:#393A34">(</span><span class="token 
plain">enemyBuildingPosition</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">else</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">enemyBase </span><span class="token operator" style="color:#393A34">!=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">null</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">attack</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">enemyBase</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getPosition</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span 
class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">else</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token class-name">ThreadLocalRandom</span><span class="token plain"> random </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token class-name">ThreadLocalRandom</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">current</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token class-name">Position</span><span class="token plain"> randomPosition </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">new</span><span class="token plain"> </span><span class="token class-name">Position</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">nextInt</span><span class="token punctuation" 
style="color:#393A34">(</span><span class="token plain">game</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">mapWidth</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">32</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">nextInt</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">game</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">mapHeight</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">32</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token 
plain">            </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">canAttack</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">randomPosition</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">attack</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">randomPosition</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></span></code></pre><div 
class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>It's simple: if the scouting drone has detected any enemy buildings (i.e. the enemy's base), the zerglings attack them. If not, they attack the position where the enemy's base should be located. If that is unknown too, the zerglings run around the map looking for the enemy and its buildings. The number 32 used when drawing random positions is the size of one tile in pixels: <code>mapWidth()</code> and <code>mapHeight()</code> return the map size in tiles, while a <code>Position</code> is given in pixels.</p>
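<p>The tile-to-pixel arithmetic behind the random position can be sketched on its own, without BWAPI. This is only an illustration: the class <code>RandomMapPosition</code> and its methods are hypothetical names and not part of BWAPI or BWMirror, and the map size is passed in as plain integers instead of being read from <code>game</code>.</p>

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, not part of BWAPI or BWMirror.
// It only mirrors the "map size in tiles * 32" arithmetic from the bot code.
final class RandomMapPosition {
    // One tile is 32x32 pixels.
    static final int TILE_SIZE = 32;

    // Converts a map dimension given in tiles into pixels.
    static int toPixels(int tiles) {
        return tiles * TILE_SIZE;
    }

    // Draws a random pixel coordinate pair {x, y} inside a map of the given size in tiles.
    static int[] randomPixelPosition(int mapWidthTiles, int mapHeightTiles) {
        ThreadLocalRandom random = ThreadLocalRandom.current();
        return new int[] {
            random.nextInt(toPixels(mapWidthTiles)),
            random.nextInt(toPixels(mapHeightTiles))
        };
    }

    public static void main(String[] args) {
        // Assumed example size: a 128x128-tile map spans 4096x4096 pixels.
        int[] position = randomPixelPosition(128, 128);
        System.out.println("x=" + position[0] + ", y=" + position[1]);
    }
}
```

<p>In the bot itself the two numbers would come from <code>game.mapWidth()</code> and <code>game.mapHeight()</code>, and the result would be wrapped in a <code>Position</code>.</p>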
<h2 class="anchor anchorWithStickyNavbar_iC02" id="summary">Summary<a class="hash-link" aria-label="Direct link to Summary" title="Direct link to Summary" href="https://dloranc.github.io/2017/04/09/five-pools#summary">​</a></h2>
<p>The bot works quite well. It sometimes goes a bit crazy on large maps when the scouting drone doesn't find the opponent in the first two possible locations, but that's not a real problem, because it wins anyway. A bigger problem is the <strong>Electric Circuit</strong> map, which has <strong>Psi Disrupters</strong> in some places. Starcraft's extremely poor pathfinding causes some units to get stuck on these buildings while trying to move past them. To get around this, I would probably have to write my own pathfinding algorithm :)</p>
<p>A few things could be improved:</p>
<ul>
<li>Mineral gathering can be optimized according to this topic on <a href="http://www.teamliquid.net/forum/brood-war/484849-improving-mineral-gathering-rate-in-brood-war" target="_blank" rel="noopener noreferrer">TeamLiquid</a>.</li>
<li>
<strike>The scout should be a newly created drone, not one taken from the initial ones.</strike>
</li>
<li>
<strike>If a drone assigned to construct a building is carrying minerals, it should return them before building. Currently, any minerals it is carrying are lost.</strike>
</li>
<li>
<strike>Scouting could be more efficient: the drone should visit the closest bases first.</strike>
</li>
<li>
<strike>Sometimes, when the drone is heading to the last unexplored base on a large map and zerglings are already being produced, they move to random places on the map because they don't know where the opponent's base is. This can be improved by sending them to the base the drone is heading to, because by elimination that must be the right one.</strike>
</li>
<li>Zerglings could fight better, for example by prioritizing what to attack first. It would also be useful to withdraw severely wounded units so they can regenerate.</li>
<li>
<strike>If the enemy base and the buildings around it are destroyed and the game is not over, there is still a building somewhere on the map that needs to be destroyed. In that case the zerglings don't even search randomly, they just gather in one place.</strike>
</li>
<li>In general, it would be useful to write some class that allows you to give orders to units and cancel them when certain circumstances arise.</li>
</ul>
<p>The code itself is not of high quality either. This mess needs to be cleaned up. It would be useful to split most of the code into separate classes and create some logic for implementing the build order.</p>
<p>You can see the bot in action here:</p>
<iframe src="https://www.youtube.com/embed/xvI2EuLPg6o" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen=""></iframe>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="whats-next">What's next?<a class="hash-link" aria-label="Direct link to What's next?" title="Direct link to What's next?" href="https://dloranc.github.io/2017/04/09/five-pools#whats-next">​</a></h2>
<p>I have more or less learned how to write a bot and run into some of the problems that come with it. Now I can finally start on <strong>reinforcement learning</strong>. I think machine learning could be used in this bot to make zerglings fight better. But more about that in the next posts.</p>]]></content>
        <category label="Projects" term="Projects"/>
        <category label="DSP 2017" term="DSP 2017"/>
        <category label="Starcraft" term="Starcraft"/>
        <category label="bwapi" term="bwapi"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Change of plans]]></title>
        <id>https://dloranc.github.io/2017/04/02/change-of-plans</id>
        <link href="https://dloranc.github.io/2017/04/02/change-of-plans"/>
        <updated>2017-04-02T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[I am writing a bot for Starcraft: Brood War. In today's post I explain the reasons that prompted me to make this change.]]></summary>
        <content type="html"><![CDATA[<p><em>This post is about the Starcraft bot I am developing using machine learning. The project is being developed as part of the "Daj Się Poznać 2017" competition.</em></p>
<hr>
<p>Unfortunately, Blizzard and DeepMind did not manage to finish the API on time and moved the release date to summer this year, which they announced on the <a href="https://us.battle.net/forums/en/sc2/topic/20753825636" target="_blank" rel="noopener noreferrer">Battle.net forum</a>:</p>
<blockquote>
<p>We wanted to give you all an update on the progress of the StarCraft II API. Blizzard and DeepMind remain hard at work together defining the API and infrastructure needed to do world class research in StarCraft II. Like many research projects we’ve been learning a lot as we’ve gone along on this new endeavor. We’re eager to get a polished set of tools and documentation into the hands of researchers and developers as soon as possible. Originally we’d hoped to have the API ready by Q1 of this year but think it’s best to shift the official release back to this summer to provide a level of quality and completeness that we know you expect from us.</p>
</blockquote>
<blockquote>
<p>We appreciate everyone’s patience as we continue to work on the API – our goal is to bring you the best possible tool, with thorough documentation.</p>
</blockquote>
<p>Typical Blizzard :) I suspect that the API will be released at the same time as the remastered version of the original Starcraft, which is also scheduled for this summer.</p>
<p>Therefore, I decided to start writing a bot for Starcraft: Brood War. It won't be easy, because as I wrote earlier, this is a rather unfriendly environment when it comes to machine learning. I also thought about writing something Starcraft-like myself. I even have something written in Phaser, but it will stay limited to a few simple examples, because I would be too embarrassed to write my own RTS from scratch. I will use this code for posts about reinforcement learning.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="bwapi--bwmirror">BWAPI + BWMirror<a class="hash-link" aria-label="Direct link to BWAPI + BWMirror" title="Direct link to BWAPI + BWMirror" href="https://dloranc.github.io/2017/04/02/change-of-plans#bwapi--bwmirror">​</a></h2>
<p>To write a bot for SC:BW, you need to install Starcraft (who would have guessed) version 1.16.1 and <a href="https://github.com/bwapi/bwapi" target="_blank" rel="noopener noreferrer">BWAPI (The Brood War API)</a>. Since I followed the tutorial on the <a href="http://sscaitournament.com/index.php?action=tutorial" target="_blank" rel="noopener noreferrer">Student StarCraft AI Tournament</a> website, I also installed BWMirror, which is a Java wrapper for BWAPI. The setup is quite simple. I only skipped the recommended Eclipse installation and set the project up in IntelliJ IDEA instead, because I already had that IDE installed. The game is launched via Chaoslauncher, which injects DLLs (or something like that, I don't know) that change the game or extract data from it.</p>
<p>It is also worth setting the following values in <code>bwapi.ini</code>:</p>
<div class="language-ini codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-ini codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token key attr-name" style="color:#00a4db">auto_menu</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">=</span><span class="token plain"> </span><span class="token value attr-value" style="color:#e3116c">SINGLE_PLAYER</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token key attr-name" style="color:#00a4db">maps</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">=</span><span class="token plain"> </span><span class="token value attr-value" style="color:#e3116c">maps\sscai\*.sc?</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>This saves a lot of time because you don't have to click through individual game windows all the time.</p>
<p>By the way, it is worth enabling the W-MODE plugin in Chaoslauncher, which allows you to run the game in a window. The <code>CTRL+F1</code> keyboard shortcut will also be useful, as it confines the mouse cursor to the game window. It is also handy to double the size of the game window and its contents by clicking the <code>2x</code> icon in the upper right corner. There is probably a keyboard shortcut for this, but I didn't feel like looking for it.</p>
<p>An example bot from the shared package looks like this:</p>
<div class="language-java codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-java codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> </span><span class="token import namespace" style="opacity:0.7">bwapi</span><span class="token import namespace punctuation" style="opacity:0.7;color:#393A34">.</span><span class="token import operator" style="color:#393A34">*</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> </span><span class="token import namespace" style="opacity:0.7">bwta</span><span class="token import namespace punctuation" style="opacity:0.7;color:#393A34">.</span><span class="token import class-name">BWTA</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> </span><span class="token import namespace" style="opacity:0.7">bwta</span><span class="token import namespace punctuation" style="opacity:0.7;color:#393A34">.</span><span class="token import class-name">BaseLocation</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span 
class="token keyword" style="color:#00009f">public</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">class</span><span class="token plain"> </span><span class="token class-name">TestBot1</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">extends</span><span class="token plain"> </span><span class="token class-name">DefaultBWListener</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">private</span><span class="token plain"> </span><span class="token class-name">Mirror</span><span class="token plain"> mirror </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">new</span><span class="token plain"> </span><span class="token class-name">Mirror</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">private</span><span class="token plain"> </span><span class="token class-name">Game</span><span class="token plain"> game</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" 
style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">private</span><span class="token plain"> </span><span class="token class-name">Player</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">public</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">void</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">run</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        mirror</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getModule</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">setEventListener</span><span class="token punctuation" style="color:#393A34">(</span><span class="token keyword" style="color:#00009f">this</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        
mirror</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">startGame</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token annotation punctuation" style="color:#393A34">@Override</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">public</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">void</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">onUnitCreate</span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">Unit</span><span class="token plain"> unit</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token class-name">System</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">out</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">println</span><span class="token punctuation" 
style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"New unit discovered "</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> unit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token annotation punctuation" style="color:#393A34">@Override</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">public</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">void</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">onStart</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        game </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> mirror</span><span class="token 
punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getGame</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        self </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> game</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">self</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic">//Use BWTA to analyze map</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic">//This may take a few minutes if the map is processed first time!</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token class-name">System</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">out</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">println</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"Analyzing 
map..."</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token constant" style="color:#36acaa">BWTA</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">readMap</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token constant" style="color:#36acaa">BWTA</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">analyze</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token class-name">System</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">out</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">println</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"Map data ready"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">int</span><span class="token plain"> i </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">BaseLocation</span><span class="token plain"> baseLocation </span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token constant" style="color:#36acaa">BWTA</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getBaseLocations</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token class-name">System</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">out</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">println</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"Base location #"</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> 
</span><span class="token punctuation" style="color:#393A34">(</span><span class="token operator" style="color:#393A34">++</span><span class="token plain">i</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">". Printing location's region polygon:"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">Position</span><span class="token plain"> position </span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> baseLocation</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getRegion</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getPolygon</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getPoints</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token class-name">System</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">out</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">position </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">", "</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token class-name">System</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">out</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">println</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    
</span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token annotation punctuation" style="color:#393A34">@Override</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">public</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">void</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">onFrame</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic">//game.setTextSize(10);</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        game</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">drawTextScreen</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">10</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">10</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"Playing as "</span><span class="token plain"> </span><span class="token 
operator" style="color:#393A34">+</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getName</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">" - "</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getRace</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token class-name">StringBuilder</span><span class="token plain"> units </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">new</span><span class="token plain"> </span><span class="token class-name">StringBuilder</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"My units:\n"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" 
style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic">//iterate through my units</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">Unit</span><span class="token plain"> myUnit </span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getUnits</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            units</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">append</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">append</span><span class="token punctuation" style="color:#393A34">(</span><span 
class="token string" style="color:#e3116c">" "</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">append</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getTilePosition</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">append</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"\n"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token comment" style="color:#999988;font-style:italic">//if there's enough minerals, train an SCV</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span 
class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token class-name">UnitType</span><span class="token class-name punctuation" style="color:#393A34">.</span><span class="token class-name">Terran_Command_Center</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;&amp;</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">minerals</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&gt;=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">50</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">train</span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">UnitType</span><span class="token class-name punctuation" style="color:#393A34">.</span><span class="token class-name">Terran_SCV</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" 
style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token comment" style="color:#999988;font-style:italic">//if it's a worker and it's idle, send it to the closest mineral patch</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isWorker</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;&amp;</span><span class="token plain"> myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isIdle</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token class-name">Unit</span><span class="token plain"> closestMineral </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span 
class="token keyword" style="color:#00009f">null</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic">//find the closest mineral</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">Unit</span><span class="token plain"> neutralUnit </span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> game</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">neutral</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getUnits</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">neutralUnit</span><span class="token punctuation" 
style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isMineralField</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">closestMineral </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">null</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">||</span><span class="token plain"> myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getDistance</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">neutralUnit</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&lt;</span><span class="token plain"> myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getDistance</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">closestMineral</span><span class="token punctuation" 
style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                            closestMineral </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> neutralUnit</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic">//if a mineral patch was found, send the worker to gather it</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">closestMineral </span><span class="token operator" style="color:#393A34">!=</span><span class="token plain"> </span><span class="token keyword" 
style="color:#00009f">null</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    myUnit</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">gather</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">closestMineral</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">false</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic">//draw my units on screen</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        game</span><span class="token punctuation" 
style="color:#393A34">.</span><span class="token function" style="color:#d73a49">drawTextScreen</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">10</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">25</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> units</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">toString</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">public</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">static</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">void</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">main</span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">String</span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> args</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> 
</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">new</span><span class="token plain"> </span><span class="token class-name">TestBot1</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">run</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></span></code></pre></div></div>
<p>It doesn't do much: it lists my units on the screen, trains an SCV whenever the Command Center has at least 50 minerals, and sends every idle worker to the closest mineral patch. I think the code is quite self-explanatory, so there is no point in describing it further.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="whats-next">What's next?<a class="hash-link" aria-label="Direct link to What's next?" title="Direct link to What's next?" href="https://dloranc.github.io/2017/04/02/change-of-plans#whats-next">​</a></h2>
<p>I'll play around with Java some more. I've even written a bot that trains units and scouts, but I haven't figured out how to construct a building yet: the sample code from the previously linked tutorial does not work, which is a pity. Once I have a simple 5-pool bot, I'll try machine learning with <a href="https://github.com/TorchCraft/TorchCraft" target="_blank" rel="noopener noreferrer">TorchCraft</a>, or with something else if I can find it, because I don't like writing in Lua.</p>]]></content>
        <category label="Projects" term="Projects"/>
        <category label="DSP 2017" term="DSP 2017"/>
        <category label="Starcraft" term="Starcraft"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Saving models and history in Keras]]></title>
        <id>https://dloranc.github.io/2017/03/28/saving-models-and-history-in-keras</id>
        <link href="https://dloranc.github.io/2017/03/28/saving-models-and-history-in-keras"/>
        <updated>2017-03-28T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Post about saving models and history in Keras, as well as interrupting script execution.]]></summary>
        <content type="html"><![CDATA[<p>Training neural networks usually takes a very long time. Sometimes you need to interrupt a script because you want to run something else that is equally resource-demanding, or because you want to turn off the computer. I don't like leaving my PC running overnight, so I stop training the network and resume it the next day.</p>
<p>So I decided to write code that lets me interrupt the script and saves the training history, so that I have it for all runs of the script, not just the last one. In addition, I wanted training to run for exactly as many epochs in total as I had planned, no matter how many times it was interrupted.</p>
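<p>Stripped of Keras, the idea can be sketched roughly as below. This is only an illustration under a few assumptions: the file name <code>history.pkl</code>, the helper names and the <code>run_epoch</code> stand-in are hypothetical and do not come from the script itself, and it uses <code>pickle</code> from Python 3 rather than <code>cPickle</code>.</p>
<pre><code class="language-python">import os
import pickle

HISTORY_PATH = "history.pkl"  # hypothetical file name


def load_history(path):
    """Return the history saved by earlier runs, or an empty one."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return {"acc": [], "loss": [], "val_acc": [], "val_loss": []}


def save_history(history, path):
    """Write the accumulated history to disk."""
    with open(path, "wb") as f:
        pickle.dump(history, f)


def remaining_epochs(history, planned_epochs):
    """Epochs still to run so that all runs together reach planned_epochs."""
    return max(0, planned_epochs - len(history["loss"]))


def train(planned_epochs, run_epoch, path=HISTORY_PATH):
    """Run the missing epochs and save the history even when interrupted."""
    history = load_history(path)
    try:
        for _ in range(remaining_epochs(history, planned_epochs)):
            logs = run_epoch()  # stand-in for one epoch of real training
            for key in history:
                history[key].append(logs.get(key))
    except KeyboardInterrupt:
        print("Interrupted, saving progress")
    finally:
        save_history(history, path)
    return history
</code></pre>
<p>Calling <code>train</code> again after an interruption continues from the number of epochs already recorded in the saved history, so the total never exceeds the planned count.</p>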
<p>The script is a modification of the <a href="https://github.com/fchollet/keras/blob/master/examples/mnist_mlp.py" target="_blank" rel="noopener noreferrer">example from the Keras repository</a>.</p>
<p>OK, time for the code. We start with the imports:</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> os</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> keras</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> cPickle</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">from</span><span class="token plain"> keras</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">callbacks </span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> Callback</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">from</span><span class="token plain"> keras</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">models </span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> load_model</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># other imports</span><br></span></code></pre></div></div>
<p>Then I created a <code>MyHistory</code> class that records <code>accuracy</code>, <code>validation accuracy</code>, <code>loss</code> and <code>validation loss</code> at the end of each epoch. I needed it because I have to extract the history from the model when the script is interrupted (a <code>KeyboardInterrupt</code> exception is raised). I haven't found any other way, though there may well be a simpler one.</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">class</span><span class="token plain"> </span><span class="token class-name">MyHistory</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">Callback</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">__init__</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token builtin">super</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">MyHistory</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">__init__</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain">        self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">history </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token string" style="color:#e3116c">'acc'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'loss'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'val_acc'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'val_loss'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">on_epoch_end</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> batch</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> logs</span><span class="token operator" style="color:#393A34">=</span><span class="token punctuation" style="color:#393A34">{</span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> key </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">history</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">keys</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">history</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">key</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" 
style="color:#393A34">.</span><span class="token plain">append</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">logs</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">get</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">key</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div>
<p>Then I created three functions:</p>
<ul>
<li><code>load_history</code> — a function that loads the previous history saved to disk using the <code>cPickle</code> library</li>
<li><code>save_history</code> — a function that saves history (also with <code>cPickle</code>)</li>
<li><code>merge_history</code> — a function that merges the previous history with this new one</li>
</ul>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">load_history</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">filename</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">with</span><span class="token plain"> </span><span class="token builtin">open</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">filename</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'rb'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> </span><span class="token builtin">file</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        history </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> cPickle</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">load</span><span class="token punctuation" style="color:#393A34">(</span><span class="token 
builtin">file</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> history</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">save_history</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">history</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">with</span><span class="token plain"> </span><span class="token builtin">open</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'history.pkl'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'wb'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> </span><span class="token builtin">file</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        
cPickle</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">dump</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">history</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token builtin">file</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">merge_history</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">previous</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> current</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    history </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"> key</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> previous</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">key</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> current</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">key</span><span class="token 
punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> key </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> current</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">keys</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> history</span><br></span></code></pre></div></div>
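<p>To make the data flow concrete, here is a minimal round-trip sketch of the three helpers. It rests on a few assumptions that are not in the original script: it targets Python 3 (so it uses <code>pickle</code> instead of <code>cPickle</code>), <code>save_history</code> takes a hypothetical <code>filename</code> argument instead of a hard-coded path, and the metric values are made up purely for illustration.</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W">import os
import pickle
import tempfile


def save_history(history, filename):
    # Pickle data must be written in binary mode.
    with open(filename, 'wb') as f:
        pickle.dump(history, f)


def load_history(filename):
    # ...and read back in binary mode as well.
    with open(filename, 'rb') as f:
        return pickle.load(f)


def merge_history(previous, current):
    # Concatenate the per-epoch lists key by key.
    return {key: previous[key] + current[key] for key in current}


# Made-up numbers, only to show the shape of the data.
first_run = {'acc': [0.61, 0.70], 'loss': [1.10, 0.85]}
second_run = {'acc': [0.74], 'loss': [0.77]}

# A temporary directory keeps the sketch from touching real files.
path = os.path.join(tempfile.mkdtemp(), 'history.pkl')
save_history(first_run, path)
merged = merge_history(load_history(path), second_run)
print(merged['acc'])
</code></pre></div></div>
<p>After the merge, each list holds one entry per completed epoch across both runs, which is what the epoch count in the next step relies on.</p>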
<p>Then, if a saved model exists, I load it. I also load the previous history, use its length to work out how many epochs have already been completed, and continue training only if some epochs remain. When I interrupt the script with CTRL+C, the <code>KeyboardInterrupt</code> exception is caught, the history is taken from the callback, and the model is saved in the <code>finally</code> block.</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">batch_size </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">128</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">num_classes </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">10</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">epochs </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">40</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># data loading and preprocessing here</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> os</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">path</span><span class="token punctuation" 
style="color:#393A34">.</span><span class="token plain">isfile</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'my_model.h5'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'Loading model...'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    model </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> load_model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'my_model.h5'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">else</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># model definition and compilation here</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">previous_history </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span 
class="token boolean" style="color:#36acaa">None</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> os</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">path</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">isfile</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'history.pkl'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    previous_history </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> load_history</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'history.pkl'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">previous_epochs </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" 
style="color:#00009f">if</span><span class="token plain"> previous_history </span><span class="token keyword" style="color:#00009f">is</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">not</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">None</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    previous_epochs </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">len</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">previous_history</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">'acc'</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">epochs </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> epochs </span><span class="token operator" style="color:#393A34">-</span><span class="token plain"> previous_epochs</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">my_history </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> MyHistory</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span 
class="token-line" style="color:#393A34"><span class="token plain">history </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">None</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">try</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> epochs </span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        history </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">fit</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">x_train</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> y_train</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                            batch_size</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">batch_size</span><span class="token punctuation" style="color:#393A34">,</span><span class="token 
plain"> epochs</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">epochs</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                            verbose</span><span class="token operator" style="color:#393A34">=</span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> validation_data</span><span class="token operator" style="color:#393A34">=</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">x_test</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> y_test</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                            callbacks</span><span class="token operator" style="color:#393A34">=</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">my_history</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">else</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'Training completed.'</span><span 
class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">except</span><span class="token plain"> KeyboardInterrupt</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'You pressed CTRL+C'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    history </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> my_history</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">history</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">finally</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">save</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'my_model.h5'</span><span class="token 
punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div>
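<p>The epoch bookkeeping in the script above can also be isolated in a small helper. This is only a sketch: <code>remaining_epochs</code> is a hypothetical name that does not appear in the original script, and it assumes that every completed epoch contributed exactly one entry under the <code>acc</code> key.</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W">def remaining_epochs(target_epochs, previous_history, key='acc'):
    # No saved history means nothing has been trained yet.
    if previous_history is None:
        return target_epochs
    # One list entry per completed epoch.
    done = len(previous_history.get(key, []))
    # Clamp at zero so an over-trained history never yields a negative count.
    return max(target_epochs - done, 0)


print(remaining_epochs(40, None))                 # 40
print(remaining_epochs(40, {'acc': [0.5] * 15}))  # 25
print(remaining_epochs(40, {'acc': [0.5] * 45}))  # 0
</code></pre></div></div>
<p>Clamping at zero is a design choice of this sketch; the original script reaches the same outcome with its <code>epochs &gt; 0</code> check.</p>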
<p>At the very end I merge and save the history:</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> history </span><span class="token keyword" style="color:#00009f">is</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">not</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">None</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">and</span><span class="token plain"> </span><span class="token builtin">type</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">history</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">is</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">not</span><span class="token plain"> </span><span class="token builtin">dict</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        history </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> history</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">history</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> previous_history </span><span 
class="token keyword" style="color:#00009f">is</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">not</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">None</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">and</span><span class="token plain"> history </span><span class="token keyword" style="color:#00009f">is</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">not</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">None</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        history </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> merge_history</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">previous_history</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> history</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> history </span><span class="token keyword" style="color:#00009f">is</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">not</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">None</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">and</span><span class="token plain"> </span><span class="token builtin">len</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">history</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">'acc'</span><span class="token punctuation" style="color:#393A34">]</span><span class="token 
punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        save_history</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">history</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="summary">Summary<a class="hash-link" aria-label="Direct link to Summary" title="Direct link to Summary" href="https://dloranc.github.io/2017/03/28/saving-models-and-history-in-keras#summary">​</a></h2>
<p>The above script saves both the model and the training history, so I can stop the calculations at any time. The one thing I still need to check is whether, after a restart, the best <code>val_loss</code> is taken from the saved model or is tracked from scratch, starting with the first epoch of the new run. This matters because if the restarted script starts over, the old best value is lost, and if only a little training remains, I may end up with a model whose weights are worse than the ones saved earlier.</p>
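<p>One way to rule out the second case without relying on what Keras does internally is to keep the best <code>val_loss</code> in a small file next to the model and compare against it before overwriting anything. The sketch below is plain Python and only illustrates the idea; the JSON file and the helper names <code>load_best</code> and <code>update_best</code> are hypothetical and are not part of the script above.</p>

```python
import json
import os
import tempfile


def load_best(path):
    """Return the best val_loss stored by an earlier run, or None if there is none."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)["best_val_loss"]


def update_best(path, val_loss):
    """Store val_loss if it beats the saved best; return True when it was stored."""
    best = load_best(path)
    if best is not None and val_loss >= best:
        # The model saved by a previous run is still better, so keep it.
        return False
    with open(path, "w") as f:
        json.dump({"best_val_loss": val_loss}, f)
    return True


if __name__ == "__main__":
    state_file = os.path.join(tempfile.mkdtemp(), "best_val_loss.json")
    print(update_best(state_file, 0.50))  # first run, nothing stored yet
    print(update_best(state_file, 0.70))  # restarted run, worse epoch
    print(update_best(state_file, 0.40))  # restarted run, better epoch
```

<p>In the training script, the model weights would be written only when <code>update_best</code> returns <code>True</code>, so a restart cannot replace a better model with a worse one.</p>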
<p>An example with working code can be found <a href="https://gist.github.com/dloranc/d7b7fbeb138e192916a7ae3a793ea477" target="_blank" rel="noopener noreferrer">here</a>.</p>]]></content>
        <category label="Recipes" term="Przepisy"/>
        <category label="DSP 2017" term="DSP 2017"/>
        <category label="Keras" term="Keras"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[We need more data - Scrapy]]></title>
        <id>https://dloranc.github.io/2017/03/12/we-need-more-data-scrapy</id>
        <link href="https://dloranc.github.io/2017/03/12/we-need-more-data-scrapy"/>
        <updated>2017-03-12T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[A machine learning project requires a lot of data, which must be obtained somehow. Today I'm going to talk about using the Scrapy library for this purpose.]]></summary>
        <content type="html"><![CDATA[<p><em>This post is about the Starcraft bot I am developing using machine learning. The project is being developed as part of the "Daj Się Poznać 2017" competition.</em></p>
<hr>
<p>Every aspiring Starcraft 2 player knows that it is not enough to play to be good. You need, among other things, to analyze your games, as well as the games of other players who are better than you. It would be nice if my bot could do something like that too, at least to a limited extent. I will therefore need replays. Where to get them from? The most common way is to take them from sites like <code>spawningtool.com</code> or <code>ggtracker.com</code>, where they are published by players. Organizers of large tournaments also provide game packs from professional players, but searching the Internet to get these packs does not interest me.</p>
<p>So I decided to write a simple scraper that goes through the subpages, pulls links from the tables and downloads games from <a href="http://lotv.spawningtool.com/replays/" target="_blank" rel="noopener noreferrer">spawningtool.com</a>.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="the-code">The code<a class="hash-link" aria-label="Direct link to The code" title="Direct link to The code" href="https://dloranc.github.io/2017/03/12/we-need-more-data-scrapy#the-code">​</a></h2>
<p>First, of course, I installed <code>Scrapy</code> (it later turned out that I also needed the <code>requests</code> library):</p>
<div class="language-sh codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-sh codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">pip </span><span class="token function" style="color:#d73a49">install</span><span class="token plain"> scrapy</span><br></span></code></pre></div></div>
<p>To start using <code>Scrapy</code> you can either write some simple script right away and fire it using the <code>scrapy runspider &lt;script&gt;</code> command, or create a project. I chose the latter option and ran the following command in the root directory of the project:</p>
<div class="language-sh codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-sh codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">scrapy startproject replays</span><br></span></code></pre></div></div>
<p>Next, in the <code>settings.py</code> file, I set the <code>pipeline</code>, which allows files to be downloaded, and also set the path to which files should be downloaded:</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># built-in pipeline that downloads the URLs listed in each item's file_urls</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">ITEM_PIPELINES </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">'scrapy.pipelines.files.FilesPipeline'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">FILES_STORE </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"./files"</span><br></span></code></pre></div></div>
<p>Then, in the <code>items.py</code> file, I created a simple model with only the fields that are required for <code>Scrapy</code> to download files:</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic"># -*- coding: utf-8 -*-</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># Define here the models for your scraped items</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic">#</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># See documentation in:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># http://doc.scrapy.org/en/latest/topics/items.html</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> scrapy</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span 
class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">class</span><span class="token plain"> </span><span class="token class-name">ReplayItem</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">scrapy</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">Item</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    file_urls </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> scrapy</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">Field</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain">  </span><span class="token comment" style="color:#999988;font-style:italic"># URLs that FilesPipeline should download</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    files </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> scrapy</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">Field</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain">  </span><span class="token comment" style="color:#999988;font-style:italic"># filled in by FilesPipeline after the download</span><br></span></code></pre></div></div>
<p>In the <code>spiders</code> directory, I created a <code>spawning-tool-spider.py</code> file with the following content:</p>
<div class="language-python codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-python codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> scrapy</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> requests</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">from</span><span class="token plain"> replays</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">items </span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> ReplayItem</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">class</span><span class="token plain"> </span><span class="token class-name">SpawningToolSpider</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">scrapy</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">Spider</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token 
plain">    name </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'spawning-tool-spider'</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># first page with replays, scrapy has to start somewhere</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    start_urls </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">'http://lotv.spawningtool.com/replays/?p=&amp;query=&amp;after_time=&amp;'</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">'before_time=&amp;after_played_on=&amp;before_played_on=&amp;patch=&amp;order_by='</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># scrapy running in the console fires this method and starts scraping</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span 
class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">parse</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> response</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> row </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> response</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">css</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'table tr'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token comment" style="color:#999988;font-style:italic"># skip the first row of the table</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> row</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">css</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'td:last-child ::attr(href)'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" 
style="color:#393A34">.</span><span class="token plain">extract_first</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">is</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">None</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">continue</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">else</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                url </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'http://lotv.spawningtool.com'</span><span class="token plain"> \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> row</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">css</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'td:last-child ::attr(href)'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">extract_first</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" 
style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic"># a typical path with a download file is http://lotv.spawningtool.com/&lt;number&gt;/download/</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic"># but when downloading there is a 302 redirect to Amazon,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic"># so something had to be done about it, because the scrapy refused to work</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic"># The solution turned out to be pulling out the Location field from the headers,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic"># which is the correct address</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                request_response </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> requests</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">head</span><span class="token punctuation" style="color:#393A34">(</span><span class="token 
plain">url</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> request_response</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">status_code </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">302</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    url </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> request_response</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">headers</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">"Location"</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic"># split, because scrapy tried to save a file with GET field values</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic"># (something like 322d3a25z2z.SC2Replay?key=0JIDaAJ)</span><span 
class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                url </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> url</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">split</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'?'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic"># ok, we have the path to the file</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">yield</span><span class="token plain"> ReplayItem</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">file_urls</span><span class="token operator" style="color:#393A34">=</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">url</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" 
style="color:#999988;font-style:italic"># go to the next page</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        next_page </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> response</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">css</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'a.pull-right ::attr(href)'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">extract_first</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> next_page</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">yield</span><span class="token plain"> scrapy</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">Request</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">response</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">urljoin</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">next_page</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> callback</span><span class="token operator" 
style="color:#393A34">=</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">parse</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div>
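<p>The URL handling in the spider (join the link with the site address, follow the 302 redirect, drop the query string) can also be written as one small function that is easy to check without touching the network. This is only a sketch using the standard <code>urllib.parse</code> module; the function name <code>final_download_url</code> and the example host <code>files.example.com</code> are made up for illustration.</p>

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def final_download_url(page_url, href, status_code, headers):
    """Build the replay URL, follow a single 302 hop and drop the query string."""
    # Resolve the (possibly relative) link against the page it was found on.
    url = urljoin(page_url, href)

    # On a redirect, the real file address is in the Location header.
    if status_code == 302 and 'Location' in headers:
        url = urljoin(url, headers['Location'])

    # Keep scheme, host and path only, so the saved file name has no GET values.
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, '', ''))


if __name__ == '__main__':
    print(final_download_url(
        'http://lotv.spawningtool.com/replays/',
        '/12345/download/',
        302,
        {'Location': 'https://files.example.com/322d3a25z2z.SC2Replay?key=0JIDaAJ'},
    ))
```

<p>In the spider, <code>status_code</code> and <code>headers</code> would come from the <code>requests.head(url)</code> response, and the returned string would go straight into <code>file_urls</code>.</p>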
<p>Then all I had to do was run the command in the console:</p>
<div class="language-sh codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-sh codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">scrapy crawl spawning-tool-spider</span><br></span></code></pre></div></div>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="whats-next">What's next?<a class="hash-link" aria-label="Direct link to What's next?" title="Direct link to What's next?" href="https://dloranc.github.io/2017/03/12/we-need-more-data-scrapy#whats-next">​</a></h2>
<p>For now, however, I'll give myself a break from trying to predict what the environment from Blizzard and DeepMind will contain. I haven't even downloaded the replays because I don't know if there's any point. This week I think I'll try to recreate a few simple Starcraft micro scenarios in the browser and, on top of that, write examples using neural networks and reinforcement learning.</p>]]></content>
        <category label="Projects" term="Projects"/>
        <category label="DSP 2017" term="DSP 2017"/>
        <category label="Scrapy" term="Scrapy"/>
        <category label="Starcraft" term="Starcraft"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Dependencies in Python]]></title>
        <id>https://dloranc.github.io/2017/03/10/dependencies-in-python</id>
        <link href="https://dloranc.github.io/2017/03/10/dependencies-in-python"/>
        <updated>2017-03-10T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[As I am not particularly proficient in Python, some things are new to me. This time, a post about how dependency management looks in Python compared to other languages I know.]]></summary>
        <content type="html"><![CDATA[<p>I use JavaScript and PHP on a daily basis at work. In these languages there are package managers <code>npm</code> (and more) and <code>composer</code> that allow easy dependency management for each project. So far, I've been writing fairly simple scripts in Python and didn't need any package manager. For upcoming projects, I decided to see what the deal is with dependency management in Python.</p>
<h2 class="anchor anchorWithStickyNavbar_iC02" id="pip-pip-installs-packages">pip (Pip Installs Packages)<a class="hash-link" aria-label="Direct link to pip (Pip Installs Packages)" title="Direct link to pip (Pip Installs Packages)" href="https://dloranc.github.io/2017/03/10/dependencies-in-python#pip-pip-installs-packages">​</a></h2>
<p>Pip is the standard package manager in Python. Below I list the basic commands.</p>
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">pip </span><span class="token function" style="color:#d73a49">install</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&lt;</span><span class="token plain">package</span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain">           </span><span class="token comment" style="color:#999988;font-style:italic"># installs a package</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">pip </span><span class="token function" style="color:#d73a49">install</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&lt;</span><span class="token plain">package</span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain"> </span><span class="token parameter variable" style="color:#36acaa">--upgrade</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic"># upgrades a package</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">pip uninstall </span><span class="token operator" style="color:#393A34">&lt;</span><span class="token plain">package</span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain">         </span><span class="token comment" style="color:#999988;font-style:italic"># uninstalls a package</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">pip list                        </span><span class="token comment" style="color:#999988;font-style:italic"># displays list of installed packages</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">pip list </span><span class="token parameter variable" style="color:#36acaa">--outdated</span><span class="token plain">             </span><span class="token comment" style="color:#999988;font-style:italic"># displays list of outdated packages</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">pip show </span><span class="token operator" style="color:#393A34">&lt;</span><span class="token plain">package</span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain">              </span><span class="token comment" style="color:#999988;font-style:italic"># displays details about a package</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">pip freeze                      </span><span class="token comment" style="color:#999988;font-style:italic"># displays list of installed packages in requirements.txt file format</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">pip </span><span class="token function" style="color:#d73a49">install</span><span class="token plain"> </span><span class="token parameter variable" style="color:#36acaa">-r</span><span class="token plain"> requirements.txt </span><span class="token comment" style="color:#999988;font-style:italic"># installs dependencies from the requirements.txt file</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
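<p>For a look at the same information from inside Python, here is a minimal sketch that prints installed packages in the <code>name==version</code> form that <code>pip freeze</code> uses. It assumes Python 3.8 or newer (for <code>importlib.metadata</code>); the helper name <code>freeze_lines</code> is made up for this example, and it is not how pip itself is implemented.</p>

```python
from importlib import metadata


def freeze_lines():
    """Return installed distributions as sorted 'name==version' strings."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    for line in freeze_lines():
        print(line)
```

<p>The output depends entirely on what is installed for the interpreter that runs the script.</p>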
<p>When creating a project, it is a good idea to list the dependencies required to run it in a <code>requirements.txt</code> file. An example <code>requirements.txt</code> file looks as follows:</p>
<div class="codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-text codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">matplotlib==1.3.1</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">numpy==1.12.0</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">scikit-learn==0.18.1</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>We can see that the list contains package names and their versions separated by <code>==</code> characters.</p>
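<p>To make the format concrete, here is a small sketch that parses such lines into a dictionary. It only handles exact pins with <code>==</code> (real requirements files allow other specifiers too), and <code>parse_pinned</code> is a hypothetical helper written for this post, not part of pip.</p>

```python
def parse_pinned(text):
    """Parse 'name==version' lines into a dict; ignore blanks and comments."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep:
            raise ValueError(f"not an exact pin: {line!r}")
        pins[name.strip()] = version.strip()
    return pins


example = """\
matplotlib==1.3.1
numpy==1.12.0
scikit-learn==0.18.1
"""
print(parse_pinned(example))
```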
<h2 class="anchor anchorWithStickyNavbar_iC02" id="pipreqs">pipreqs<a class="hash-link" aria-label="Direct link to pipreqs" title="Direct link to pipreqs" href="https://dloranc.github.io/2017/03/10/dependencies-in-python#pipreqs">​</a></h2>
<p>However, what if we already have a project and need to create a list of its dependencies? Do we have to pick them out by hand from the output of the <code>pip freeze</code> command? This is where the <code>pipreqs</code> library comes to the rescue.</p>
<p>We install it:</p>
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">pip </span><span class="token function" style="color:#d73a49">install</span><span class="token plain"> pipreqs</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Then we run the command, passing the project directory for which we want the list of dependencies as a parameter:</p>
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">pipreqs project_directory/</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
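<p>The general idea of deriving dependencies from a project's imports can be sketched with the standard <code>ast</code> module. This is only an illustration of the idea, not the actual code of <code>pipreqs</code>; the function name <code>top_level_imports</code> and the sample source are made up for the example.</p>

```python
import ast


def top_level_imports(source):
    """Return the sorted top-level module names imported by the given source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 is a relative import (from . import x), so not a dependency
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return sorted(names)


sample = (
    "import numpy as np\n"
    "from sklearn.svm import SVC\n"
    "from . import helpers\n"
    "import os.path\n"
)
print(top_level_imports(sample))  # ['numpy', 'os', 'sklearn']
```

<p>A real tool still has to filter out standard library modules such as <code>os</code> and map import names to package names (for example <code>sklearn</code> to <code>scikit-learn</code>), which this sketch does not attempt.</p>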
<h2 class="anchor anchorWithStickyNavbar_iC02" id="virtualenv">virtualenv<a class="hash-link" aria-label="Direct link to virtualenv" title="Direct link to virtualenv" href="https://dloranc.github.io/2017/03/10/dependencies-in-python#virtualenv">​</a></h2>
<p>That's all nice, but bare <code>pip</code> has one major drawback: it installs all packages globally (it's like using <code>npm install -g &lt;package&gt;</code>). If we have several projects that need different versions of the same library, we have a problem. The solution is to use the <code>virtualenv</code> library. It allows you to create isolated environments, each with its own Python and dependencies. Let's see how it works.</p>
<p>To install <code>virtualenv</code> type in the console:</p>
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">pip </span><span class="token function" style="color:#d73a49">install</span><span class="token plain"> virtualenv</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Create a virtual environment in the project directory:</p>
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">virtualenv </span><span class="token function" style="color:#d73a49">env</span><span class="token plain">            </span><span class="token comment" style="color:#999988;font-style:italic"># env is the name of the directory where the environment will be</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">virtualenv </span><span class="token parameter variable" style="color:#36acaa">-p</span><span class="token plain"> python3 </span><span class="token function" style="color:#d73a49">env</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic"># when we want to use Python 3</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>This creates the <code>env</code> directory. Let's list its contents:</p>
<div class="language-shell-session codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-shell-session codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token command shell-symbol important">$</span><span class="token command"> </span><span class="token command bash language-bash function" style="color:#d73a49">ls</span><span class="token command bash language-bash"> env/</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token output">bin include lib local pip-selfcheck.json</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The <code>env/bin/</code> directory contains what we are most interested in, namely the local version of Python and pip.</p>
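<p>A similar environment can also be created from Python itself. The sketch below uses the standard library's <code>venv</code> module (available since Python 3.3) rather than <code>virtualenv</code>, so the exact directory contents may differ from the listing above; <code>create_env</code> is a hypothetical helper name.</p>

```python
import os
import tempfile
import venv


def create_env(parent_dir, name="env"):
    """Create a virtual environment and return the path to its scripts directory."""
    env_dir = os.path.join(parent_dir, name)
    # with_pip=False keeps the sketch fast and independent of ensurepip;
    # pass with_pip=True to get a local pip as well
    venv.create(env_dir, with_pip=False)
    # POSIX layouts use bin/, Windows uses Scripts/
    scripts = "Scripts" if os.name == "nt" else "bin"
    return os.path.join(env_dir, scripts)


with tempfile.TemporaryDirectory() as tmp:
    bin_dir = create_env(tmp)
    print(sorted(os.listdir(bin_dir)))
```

<p>The example builds the environment in a temporary directory so that it cleans up after itself; in a real project you would pass the project directory instead.</p>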
<p>OK, let's test <code>virtualenv</code>. Installing a library into the environment looks as follows:</p>
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">env/bin/pip </span><span class="token function" style="color:#d73a49">install</span><span class="token plain"> requests</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Let's verify that this environment is indeed isolated from the main one:</p>
<div class="language-shell-session codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-shell-session codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token command shell-symbol important">$</span><span class="token command"> </span><span class="token command bash language-bash">env/bin/pip list</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token output">appdirs (1.4.3)</span><br></span><span class="token-line" style="color:#393A34"><span class="token output">packaging (16.8)</span><br></span><span class="token-line" style="color:#393A34"><span class="token output">pip (9.0.1)</span><br></span><span class="token-line" style="color:#393A34"><span class="token output">pyparsing (2.2.0)</span><br></span><span class="token-line" style="color:#393A34"><span class="token output">requests (2.13.0)</span><br></span><span class="token-line" style="color:#393A34"><span class="token output">setuptools (34.3.1)</span><br></span><span class="token-line" style="color:#393A34"><span class="token output">six (1.10.0)</span><br></span><span class="token-line" style="color:#393A34"><span class="token output">wheel (0.29.0)</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" 
d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
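<p>Isolation can also be checked from inside the interpreter. The following is a minimal sketch, assuming the usual conventions: older <code>virtualenv</code> releases set <code>sys.real_prefix</code>, while <code>venv</code> and newer <code>virtualenv</code> make <code>sys.prefix</code> differ from <code>sys.base_prefix</code>. The function name is made up for the example.</p>

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv make sys.prefix differ from sys.base_prefix
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("prefix:", sys.prefix)
print("inside a virtual environment:", in_virtual_environment())
```

<p>Saved to a file and run once with the system Python and once with <code>env/bin/python</code>, the script should report different prefixes.</p>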
<h2 class="anchor anchorWithStickyNavbar_iC02" id="making-it-easier-to-work-with-virtualenv">Making it easier to work with virtualenv<a class="hash-link" aria-label="Direct link to Making it easier to work with virtualenv" title="Direct link to Making it easier to work with virtualenv" href="https://dloranc.github.io/2017/03/10/dependencies-in-python#making-it-easier-to-work-with-virtualenv">​</a></h2>
<p>What if we don't want to keep typing <code>env/bin/pip</code> and similar commands pertaining to our environment?</p>
<p>First, for testing purposes, we will use the <code>which</code> command, which displays the full path of the executable that would be run:</p>
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token function" style="color:#d73a49">which</span><span class="token plain"> python</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The result of calling this command will most likely be the following path (or similar):</p>
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">/usr/bin/python</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>OK, let's switch Python and the other tools to the ones from our newly created environment:</p>
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token builtin class-name">source</span><span class="token plain"> env/bin/activate</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The above command runs a script in the current shell instance that adjusts the environment variables (most importantly <code>PATH</code>) so that the executables from the virtual environment are found first.</p>
<p>Type <code>which python</code> again in the console and you get something like:</p>
<div class="language-bash codeBlockContainer_ZRvx theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_kXrl"><pre tabindex="0" class="prism-code language-bash codeBlock_AWiQ thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_Of8W"><span class="token-line" style="color:#393A34"><span class="token plain">/home/Dawid/python_projects/virtualenv_test/env/bin/python</span><br></span></code></pre><div class="buttonGroup_zAFf"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_HmkA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_CGmf"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_fJVC"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
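<p>Why the result of <code>which python</code> changes can be illustrated with a short sketch: whichever directory comes first on <code>PATH</code> wins. The example assumes a POSIX system and uses a temporary directory with a tiny fake <code>python</code> file as a stand-in for <code>env/bin</code>; <code>resolve</code> is a hypothetical helper, and the real <code>activate</code> script does more than this.</p>

```python
import os
import shutil
import stat
import tempfile


def resolve(program, extra_dir=None):
    """Resolve a program like `which`, optionally with extra_dir first on PATH."""
    search_path = os.environ.get("PATH", "")
    if extra_dir:
        search_path = extra_dir + os.pathsep + search_path
    return shutil.which(program, path=search_path)


with tempfile.TemporaryDirectory() as fake_bin:
    # Hypothetical stand-in for env/bin/python: a tiny executable file
    fake_python = os.path.join(fake_bin, "python")
    with open(fake_python, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(fake_python, os.stat(fake_python).st_mode | stat.S_IXUSR)

    print("before:", resolve("python"))
    print("after: ", resolve("python", extra_dir=fake_bin))
```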
<h2 class="anchor anchorWithStickyNavbar_iC02" id="summary">Summary<a class="hash-link" aria-label="Direct link to Summary" title="Direct link to Summary" href="https://dloranc.github.io/2017/03/10/dependencies-in-python#summary">​</a></h2>
<p>That's it, we learned the basic use of <code>pip</code> and <code>virtualenv</code>. Next time I will write about creating packages and publishing them to PyPI. I must also admit that I was very surprised that <code>pip</code> on its own does not keep libraries separate for each project and that this requires installing a special package. Strange, my experience with PHP and JS is different. I bet it's for historical reasons.</p>]]></content>
        <category label="Recipes" term="Recipes"/>
        <category label="DSP 2017" term="DSP 2017"/>
        <category label="Python" term="Python"/>
        <category label="pip" term="pip"/>
        <category label="virtualenv" term="virtualenv"/>
    </entry>
</feed>