<title>
<![CDATA[ Short story about optimisation ]]>
</title>
<description>
<![CDATA[ Some time ago, I posted a screenshot on Twitter showing a flame-chart from the Profiler tool. At that time, I was working on improving the performance of an application we were developing at Egnyte… ]]>
</description>
<link>https://przemuh.dev/en/blog/tree-performance-improvement-case-study</link>
<guid isPermaLink="false">https://przemuh.dev/en/blog/tree-performance-improvement-case-study</guid>
<pubDate>Sun, 08 Nov 2020 00:00:00 GMT</pubDate>
<content:encoded><p>Some time ago, I posted a screenshot on <a href="https://twitter.com/przemuh/status/1319595759935852544" target="_blank" rel="nofollow noopener noreferrer">Twitter</a> showing a flame-chart from the Profiler tool. At that time, I was working on improving the performance of an application we were developing at Egnyte. One feature, when fed a large amount of data, took an incredibly long time: 3.5 minutes! During this time, the application displayed a "spinner," and the user couldn't tell whether something was happening or the app had frozen. After a few days of working with the Profiler, I managed to implement improvements that reduced the calculation time from 3.5 minutes to 35 seconds. In this post, I would like to describe how I achieved this.</p><h2>Description of the functionality</h2><p>Let's start with a description of the feature whose performance was, to put it briefly, lacking. One of the main views shows a list of folders with files containing sensitive data. These can be credit card numbers, medical data, personal data, and more. This data can fall under one of several built-in policies, such as HIPAA or GDPR, but we also allow users to define their own policy, for example, based on a previously created dictionary. Initially, the "Sensitive Content" view showed only a flat list of folders. Last year, our Product Owner, along with the UX team, concluded that working with a flat list of folders might be inefficient. A much better, and essentially more natural, way of representing the data would be a folder tree.</p><p><figure class="gatsby-resp-image-figure">
<span class="gatsby-resp-image-wrapper" style="position:relative;display:block;margin-left:auto;margin-right:auto;max-width:944px">
<a class="gatsby-resp-image-link" href="/static/f622f9b398ed8964cce4a32ffc9df3fb/966a0/sc-view.png" style="display:block" target="_blank" rel="noopener">
<span class="gatsby-resp-image-background-image" style="padding-bottom:37.883959044368595%;position:relative;bottom:0;left:0;background-image:url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAICAYAAAD5nd/tAAAACXBIWXMAAAsTAAALEwEAmpwYAAABWElEQVQoz01R127DMBDz/39gn4oUKYJsa1nLGmZ5clJEwOE0KIqkpsesITUrgydrrJ+K3eC+JFx8go4ramlIucD5COM8tPUwUpzbJfyvJ2MdvA9YXuUWj5QyaiVBrcjsfi24GwvDCjEi5xWORDEmpJCQ1xWWPHJ3mpWCNoZgAyVzbcZFHwJ665BhSPhN5Yu1A9v6vt9aA9kGVu4K8XS8KJxuBqe7we9V4fxwUEuG8QU2NoTccVlWfF01zvcdZwKtpwbtSeYSbsrjlzyzTZges4KiKqlZ6VE+RBRaraJwA+xacWC+lsqVtuOsc7+Uiq0mRhXxJE+i2t2yFlKNQOmRGUl3zjHHOqy5UvCjPiyL1WGZhJ0qvR+WV7GsCJRA36DPsW17t8zwIMr5yI7t40w+rlGqEM7vDHeFeigav0i7G9FS/cXoaO3ISJzdf1oel3NxIB80FJJDFP4BCJ5pdpZ5/CkAAAAASUVORK5CYII=');background-size:cover;display:block"></span>
<img class="gatsby-resp-image-image" alt="Sensitive Content List View" title="Sensitive Content List View" src="/static/f622f9b398ed8964cce4a32ffc9df3fb/966a0/sc-view.png" srcSet="/static/f622f9b398ed8964cce4a32ffc9df3fb/3cf3e/sc-view.png 293w,/static/f622f9b398ed8964cce4a32ffc9df3fb/78a22/sc-view.png 585w,/static/f622f9b398ed8964cce4a32ffc9df3fb/966a0/sc-view.png 944w" sizes="(max-width: 944px) 100vw, 944px" style="width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0" loading="lazy"/>
</a>
</span>
<figcaption class="gatsby-resp-image-figcaption">Sensitive Content List View</figcaption>
</figure></p><h2>Tree building algorithm</h2><p>There have been several approaches to folder trees in our project. They mainly relied on the unique <code>folderId</code> property. Unfortunately, in the case of "Sensitive Content," we couldn't rely on it: not every folder contains sensitive data, and we receive a <code>folderId</code> only for the folders that do.</p><div class="gatsby-highlight" data-language="text"><pre class="language-text"><code class="language-text">/Shared/A/B/
In this case, we have 3 folders (Shared, A, B), of which only B has a folderId</code></pre></div><p>There can be a multitude of such SC (Sensitive Content) locations. The performance issue already appeared at 250K locations. And that was no exception, as confirmed by a client for whom we found almost <strong>a million</strong> folders. For a list of 1M elements, the tree-building time was 3.5 minutes. Therefore, my task was to ensure that the tree for 1M elements builds in less than 60 seconds.</p><p>Back to the algorithm. It was very simple, or so it seemed :)</p><ol><li>Take the entire path and split it into fragments on the separator, e.g., <code>/</code></li><li>Insert each folder into two structures: "tree" and "flat"</li><li>If a folder does not have a <code>folderId</code>, treat it as a meta-folder</li></ol><p>For simplicity, I omit the fact that we support different data sources, and that these separators can vary greatly :) Moreover, as it later turned out, some data sources can have two folders with the same name at the same nesting level 😱. And how do you tell them apart? I will also skip the fact that the resulting tree was to be presented as a "sparse-tree." In short, this means that if a folder contains only one sub-folder, the two are collapsed into a single node with a merged path.</p><div class="gatsby-highlight" data-language="text"><pre class="language-text"><code class="language-text">List:
/Shared/A/B/C
/Shared/A/B/D
-->
Tree:
/Shared/A/B
  /C
  /D</code></pre></div><h2>First, or rather second, implementation</h2><p>As I mentioned earlier, this was not the first tree we had to display in the application. In a completely different view, we also had to create a sparse-tree and didn't want to have several different implementations. Therefore, we wrote a simple module for building and managing the tree. It was based on two small components:</p><ul><li>a <code>buildTree</code> function that took a flat array of nodes and the place (a path in the tree) at which to insert these nodes</li><li>a "slice" from <code>redux-toolkit</code> that managed the tree structure (expanding, collapsing nodes, etc.)</li></ul><p>The entire tree, or rather these two structures, "tree" and "flat" (stored under <code>paths</code>), were kept in redux as follows:</p><div class="gatsby-highlight" data-language="json"><pre class="language-json"><code class="language-json"><span class="token punctuation">{</span>
  tree<span class="token operator">:</span> <span class="token punctuation">{</span>
    path<span class="token operator">:</span> <span class="token string">""</span><span class="token punctuation">,</span> <span class="token comment">// root</span>
    children<span class="token operator">:</span> <span class="token punctuation">{</span>
      <span class="token property">"Shared"</span><span class="token operator">:</span> <span class="token punctuation">{</span> <span class="token comment">// path-part or folder name as a key</span>
        path<span class="token operator">:</span> <span class="token string">"/Shared"</span><span class="token punctuation">,</span>
        children<span class="token operator">:</span> <span class="token punctuation">{</span>
          <span class="token property">"A"</span><span class="token operator">:</span> <span class="token punctuation">{</span>
            path<span class="token operator">:</span> <span class="token string">"/Shared/A"</span><span class="token punctuation">,</span>
            children<span class="token operator">:</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
          <span class="token punctuation">}</span>
        <span class="token punctuation">}</span>
      <span class="token punctuation">}</span>
    <span class="token punctuation">}</span>
  <span class="token punctuation">}</span><span class="token punctuation">,</span>
  paths<span class="token operator">:</span> <span class="token punctuation">{</span>
    <span class="token property">"/Shared"</span><span class="token operator">:</span> <span class="token punctuation">{</span>
      meta<span class="token operator">:</span> <span class="token boolean">true</span><span class="token punctuation">,</span>
      ...nodeProps
    <span class="token punctuation">}</span><span class="token punctuation">,</span>
    <span class="token property">"/Shared/A"</span><span class="token operator">:</span> <span class="token punctuation">{</span>
      meta<span class="token operator">:</span> <span class="token boolean">false</span><span class="token punctuation">,</span>
      folderId<span class="token operator">:</span> <span class="token string">"some-unique-id"</span><span class="token punctuation">,</span>
      ...nodeProps
    <span class="token punctuation">}</span>
  <span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre></div><p>This structure is obtained from the helper function <code>buildTree</code>.</p><p>Thanks to the use of <a href="https://redux-toolkit.js.org/" target="_blank" rel="nofollow noopener noreferrer">redux-toolkit</a>, and consequently the <a href="https://github.com/immerjs/immer" target="_blank" rel="nofollow noopener noreferrer">immer</a> library, we could perform operations on the tree very easily:</p><div class="gatsby-highlight" data-language="javascript"><pre class="language-javascript"><code class="language-javascript"><span class="token keyword">const</span> initialTreeState <span class="token operator">=</span> <span class="token punctuation">{</span>
  <span class="token literal-property property">initialized</span><span class="token operator">:</span> <span class="token boolean">false</span><span class="token punctuation">,</span>
  <span class="token literal-property property">tree</span><span class="token operator">:</span> <span class="token punctuation">{</span>
    <span class="token literal-property property">path</span><span class="token operator">:</span> <span class="token string">""</span><span class="token punctuation">,</span>
    <span class="token literal-property property">children</span><span class="token operator">:</span> <span class="token punctuation">{</span><span class="token punctuation">}</span><span class="token punctuation">,</span>
  <span class="token punctuation">}</span><span class="token punctuation">,</span>
  <span class="token literal-property property">paths</span><span class="token operator">:</span> <span class="token punctuation">{</span><span class="token punctuation">}</span><span class="token punctuation">,</span>
<span class="token punctuation">}</span>
<span class="token keyword">export</span> <span class="token keyword">const</span> <span class="token function-variable function">createTreeSlice</span> <span class="token operator">=</span> <span class="token parameter">treeName</span> <span class="token operator">=></span>
  <span class="token function">createSlice</span><span class="token punctuation">(</span><span class="token punctuation">{</span>
    <span class="token literal-property property">name</span><span class="token operator">:</span> treeName<span class="token punctuation">,</span>
    <span class="token literal-property property">initialState</span><span class="token operator">:</span> initialTreeState<span class="token punctuation">,</span>
    <span class="token literal-property property">reducers</span><span class="token operator">:</span> <span class="token punctuation">{</span>
      <span class="token function-variable function">insertTree</span><span class="token operator">:</span> <span class="token punctuation">(</span><span class="token parameter"><span class="token punctuation">{</span> tree<span class="token punctuation">,</span> paths <span class="token punctuation">}</span><span class="token punctuation">,</span> <span class="token punctuation">{</span> payload <span class="token punctuation">}</span></span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">{</span>
        <span class="token comment">// Merge into the draft in place; reassigning the destructured `paths` would leave the state untouched</span>
        Object<span class="token punctuation">.</span><span class="token function">assign</span><span class="token punctuation">(</span>paths<span class="token punctuation">,</span> payload<span class="token punctuation">.</span>paths<span class="token punctuation">)</span>
        <span class="token keyword">const</span> node <span class="token operator">=</span> <span class="token function">getNodeByPath</span><span class="token punctuation">(</span>payload<span class="token punctuation">.</span>parentPath <span class="token operator">||</span> <span class="token string">""</span><span class="token punctuation">,</span> tree<span class="token punctuation">)</span>
        node<span class="token punctuation">.</span>children <span class="token operator">=</span> payload<span class="token punctuation">.</span>tree<span class="token punctuation">.</span>children
      <span class="token punctuation">}</span><span class="token punctuation">,</span>
      <span class="token function-variable function">toggleNode</span><span class="token operator">:</span> <span class="token punctuation">(</span><span class="token parameter"><span class="token punctuation">{</span> tree <span class="token punctuation">}</span><span class="token punctuation">,</span> <span class="token punctuation">{</span> <span class="token literal-property property">payload</span><span class="token operator">:</span> path <span class="token punctuation">}</span></span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">{</span>
        <span class="token keyword">const</span> node <span class="token operator">=</span> <span class="token function">getNodeByPath</span><span class="token punctuation">(</span>path<span class="token punctuation">,</span> tree<span class="token punctuation">)</span>
        node<span class="token punctuation">.</span>expanded <span class="token operator">=</span> <span class="token operator">!</span>node<span class="token punctuation">.</span>expanded
      <span class="token punctuation">}</span><span class="token punctuation">,</span>
    <span class="token punctuation">}</span><span class="token punctuation">,</span>
  <span class="token punctuation">}</span><span class="token punctuation">)</span></code></pre></div><p>The helper function <code>getNodeByPath</code> is used to search for a node by path. It can also search for a node in a sparse-tree.</p><h2>First attempts and first mistakes</h2><p>And everything was going smoothly, but then a client with 1 million folders came along, and boom... The Product Owner set up an Epic in Jira titled "Support 1M folders on SC tree view." A quick brainstorming session produced a list full of ideas right away:</p><ul><li>maybe build the tree on the fly, as we parse JSON?</li><li>maybe build the tree in a web-worker, so at least we won't block the main thread for 3.5 minutes?</li><li>or maybe just drop everything and become a farmer? ⛰ 🐑</li></ul><p><img src="https://media.giphy.com/media/kPtv3UIPrv36cjxqLs/giphy.gif" alt="Or maybe..."/></p><p>The first mistake - no one even started the Profiler to see what was taking so long. Everyone assumed that the current tree implementation was top-notch and couldn't be better. The Profiler itself is not a simple tool at first glance, and maybe that was the reason we jumped to ideas like building the tree "on the fly" or moving it to a web-worker.</p><p>You're probably wondering - but how on the fly? After all, when you make a request, the browser parses the JSON and hands over the result only once the whole response has arrived. Yes, but... in this case, our backend developers also had to work a bit on optimization, and instead of returning the full data at once, they started returning our SC list as a stream. Thanks to this, we could, for example, use the <code>oboe.js</code> library to parse JSON on the fly.</p><p>Of course, I tried this approach because, after all, someone wrote it into the Jira task, so it had to be checked, right? 😜 Cool, the JSON was parsing "on the fly," but the stream lasted 30s instead of 10s, and I hadn't even started building the tree yet. 
So I gave up and decided to look elsewhere.</p><h2>Web-worker</h2><p>I also tested the approach with a web-worker, but I ran into a completely different problem. Okay - I can download 1M elements and build a tree from them, but afterwards I have to send it from the web-worker to the main thread. The tree structure is quite extensive, as is the data we saved in the flat <code>paths</code> structure. If we want to send such large data from one thread to another, the browser has to serialize the data, send it, and then parse it again. This also caused the browser to "freeze" while the data was being copied from one memory location to another. Of course, there are ways to send data "directly" (without copying) through so-called Transferable Objects, e.g., ArrayBuffer, but I figured that, for now, it might not be worth the effort, and chose to check whether our tree implementation was as great as we thought 😜</p><h2>Profiler</h2><p>I sat in front of the computer screen, launched the dev-tools, and pressed the "record" button in the Profiler. After a while, I got a colorful graph that reminded me of the disk defragmenter from the Windows 98 days 🤣</p><p><figure class="gatsby-resp-image-figure">
<span class="gatsby-resp-image-wrapper" style="position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1170px">
<a class="gatsby-resp-image-link" href="/static/08d444b93bebf165ed87c320556ad7b0/d9ed5/flame-graph-violet.png" style="display:block" target="_blank" rel="noopener">
<span class="gatsby-resp-image-background-image" style="padding-bottom:62.45733788395904%;position:relative;bottom:0;left:0;background-image:url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAMCAYAAABiDJ37AAAACXBIWXMAABYlAAAWJQFJUiTwAAAB5ElEQVQoz32S63KiQBCFef8X2t97qYq1sWIUXW8gJgICQWBuwIDxbA8Y12xMpuqrmb7MmaZpy/NcbJ89uJ6HjbeCH/sIkgj7FyJNEEUbpHsbeby8kEWLnv2cWCCncxIs4LoLWFo3aJoGdV13VFXV2W/rdAPcsM0qKw0ryzOoqoQs36ggpADPI0iCxx5YuH5HEaxQhCs6r3rb7HsHjPKtQ1qQYAtZHS8IqTGPhrCzn5j9/o6nH0OEgzGCuxF8wuzh3bhn8NjF/MEDQnsKS2RHlPUr1BVlfQLLGhRJjdWji8l0CsfdwNl6WCZrzF+WGLsTrGcuFmMHc3uJJe35H2kE23di16IG8VAhHdHPcQIc/AxCa3BdY+dv8DSy4dkzeKMZ4vsQbCNgcdZ8qPBC9QrmVJBDDTUhti3K9gTqCgq3RPqtQD4owH4JyHvq/c5UyL8QJEQowTwF7tOFlEQ1+RvyJxpsUoIHdQdzJVRUk6BoPxUzl0VqXhYQMVWaGcFT55c53ds1kFSQzKkNvoA60NhIRf2ipGtU3dOdJQkbobyBYm3vM4hjjyQ4TcmhpngDi3ENoRpwpc/Qi90YnRHnKgziX8y0Sqq2h75SMpNDFe52AeI4QcEYuKCB5tRgqf5D3vB9zBFC4i+gT4+kJsCwBQAAAABJRU5ErkJggg==');background-size:cover;display:block"></span>
<img class="gatsby-resp-image-image" alt="Flame graph" title="Flame graph" src="/static/08d444b93bebf165ed87c320556ad7b0/105d8/flame-graph-violet.png" srcSet="/static/08d444b93bebf165ed87c320556ad7b0/3cf3e/flame-graph-violet.png 293w,/static/08d444b93bebf165ed87c320556ad7b0/78a22/flame-graph-violet.png 585w,/static/08d444b93bebf165ed87c320556ad7b0/105d8/flame-graph-violet.png 1170w,/static/08d444b93bebf165ed87c320556ad7b0/28884/flame-graph-violet.png 1755w,/static/08d444b93bebf165ed87c320556ad7b0/92bee/flame-graph-violet.png 2340w,/static/08d444b93bebf165ed87c320556ad7b0/d9ed5/flame-graph-violet.png 2880w" sizes="(max-width: 1170px) 100vw, 1170px" style="width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0" loading="lazy"/>
</a>
</span>
<figcaption class="gatsby-resp-image-figcaption">Flame graph</figcaption>
</figure></p><p>The first thing that caught my eye was the purple color, which dived very, very deep. Upon closer inspection, it turned out that a lot of these purple elements were the work of <code>immer.js</code>. A quick glance at the documentation and boom! A bullseye. It turns out that when "inserting" a large amount of data through <code>immer</code>, we can speed up the process by freezing the data with <code>Object.freeze</code> (<a href="https://immerjs.github.io/immer/docs/performance#performance-tips" target="_blank" rel="nofollow noopener noreferrer">more info here</a>). This change allowed me to go from 12.54s to 11.24s for 54K elements. For 1M, the jump was, of course, proportionally larger. But that still wasn't it...</p><h2>From profiler to sources</h2><p>Did you know that if you click on a block in the Profiler and then jump to the source file, you get times for individual code blocks? No!? 😎 Now you know ;)</p><p><figure class="gatsby-resp-image-figure">
<span class="gatsby-resp-image-wrapper" style="position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1170px">
<a class="gatsby-resp-image-link" href="/static/1ba2423b674f0cab158baa993f4c7cfd/0d0e4/source-before.png" style="display:block" target="_blank" rel="noopener">
<span class="gatsby-resp-image-background-image" style="padding-bottom:90.7849829351536%;position:relative;bottom:0;left:0;background-image:url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAASCAYAAABb0P4QAAAACXBIWXMAABYlAAAWJQFJUiTwAAACTklEQVQ4y4VUa5PaMAzk//+29kOnTO8KB4RHICTYie3Ez5Ct7IO0PMopeBwie0falTSRqkXFGGoh0LYe2+0eH/M1lssa63WDLDthnx+hlIF3Dn3f43w+36zrtxACJl3XQcsCtmOIFh2y1gQksVhU2O0O4ExA64DgA4bhjP9ZvDtp2xaKbwn0SIcvDhPQMAfOLUXVw1mHrvOUgUUMoO/PGOIz/F0joNYaHV/BqhwXPHi6/Gu6x2y2x6li8N6n7/cA93sCdMTL1ZLj4pRKYTaf4+19j80mcihvAK7vD4DGGMjiBzr2ezygSahDecKBNZCdgJQCQsp04VlkD4Bs+w0s//lPlEDTcJQlx3JFHCv3FOBphJFDUUyhTvOrmxwDdrlAtqpwLDiEMNDGJ98nKy9Sjqq1ooRp6xEw/jhT2GQM5VHDmv7m8oAXKVtrEZyhGrOjwzqLuuVoDaWrLQz5bDBwvX2p9phyW72n0rmaMxbZusL0bYPVJkNeMCzof0kixW645+8GUFF5GAKIB2+4IR6LJcfig2FFe76tqS4FlRApLy0cFfzTCCNgoMK9SSH1Z6BIA5T0SWXdORLH0btPbej9C8CWLWHkfkwj0GEXgURPAJYikrQiUP/Qvw8pR5Xr/Hvi8Won3iHLacqUBRV4Q/UYI9SIfGttXosSVTaqomnDx+FgiYJTXYPVPIFbS5MmOMQ2ve/rp53ijYS33UWLgYaDBSs18p1CwxmOJUNVEdeh/7pTUh16gz7YNCgNdYTVJBKpHMKnOHH3/nzp5eFlyn8A2CB77hemAIcAAAAASUVORK5CYII=');background-size:cover;display:block"></span>
<img class="gatsby-resp-image-image" alt="Times before optimization for buildTree" title="Times before optimization for buildTree" src="/static/1ba2423b674f0cab158baa993f4c7cfd/105d8/source-before.png" srcSet="/static/1ba2423b674f0cab158baa993f4c7cfd/3cf3e/source-before.png 293w,/static/1ba2423b674f0cab158baa993f4c7cfd/78a22/source-before.png 585w,/static/1ba2423b674f0cab158baa993f4c7cfd/105d8/source-before.png 1170w,/static/1ba2423b674f0cab158baa993f4c7cfd/0d0e4/source-before.png 1230w" sizes="(max-width: 1170px) 100vw, 1170px" style="width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0" loading="lazy"/>
</a>
</span>
<figcaption class="gatsby-resp-image-figcaption">Times before optimization for buildTree</figcaption>
</figure></p><p>What stands out is the 229ms spent building a simple string 🤯, namely the current path. It turned out that this simple oversight could be replaced with a shorter piece of code that ultimately takes 1.7ms.</p><p><figure class="gatsby-resp-image-figure">
<span class="gatsby-resp-image-wrapper" style="position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1170px">
<a class="gatsby-resp-image-link" href="/static/831c9d5d2fd63e39d71b3e58d9daf5e6/7be33/source-after.png" style="display:block" target="_blank" rel="noopener">
<span class="gatsby-resp-image-background-image" style="padding-bottom:100.34129692832765%;position:relative;bottom:0;left:0;background-image:url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAUCAYAAACNiR0NAAAACXBIWXMAABYlAAAWJQFJUiTwAAACr0lEQVQ4y41UiZKbMBTb//+7trOdZtPN5iKBcBp8c6iy0yXXTloYD8R2hCzpvRelFLQ4wHYplPZIkgyL32ssVyV2iUSWdlivCz4LOGvR955jwDB8PV6MsdD1Bm3xgWkCvPeoig4fqxrfvudYLrfY7zJUlYI1DtM44NlFQA2jGg4BTH9nhxFOedSNg1IWRlt466H1xN8a3rm4bSKD+/EiOsU/FzCyiAwD5uQGNCeL7baF4nrbdmgagaKoECSqa55K9zPoDcNWGvRO8qtqXgwMDukJq+SEtMyprYYQBmUlIaWjLCN1HL8GDEduixWqdHneEO5xgiaTPQEXbwLHY8cTeMhMw1KKz+se7C+ggaIpTb662SSlxm6bYL9X2O06zpPRGDS51exaywjoeLzp7otD36OhTkI8d/TLI2vqo6oVo7O+bCCZvKzwyjyumcvkVOKQH7E+JjgUOaqO5niaZdtHhsE1Vb7z2B+XRd5SSbrZMX8WFdkGp8tSxjyG7F6zewA0XY7BNvOk8xZZWWKf58ibE2rZ4NQ0aExN8v4hg9fgPLKByF5RHn5cFjiUcHhf0OVfW2w2jNCqYC4bZJmMYb9Uwa2OkaG3ikbYGexzXRYKGR3e7PhMa7otCNzy2TLo53APw20eX6SUcM7GWFwfoafTzg/Uq4dl2Xnf833gO+fdSLDz+zjeAcZu0+xh2sNVKbMSFAG6Hq3w6GiIECrWtnP90+jE2LTZTzq9mCe1djjmNeNS0ZwqOl3Xjgx9ZBTKLoxrdjPDWCnlG7ribQZ07C5FlmO9o26JoL4DusC2NXy2ZOuihqE3PgBaNk1nTRyfkyMbZV2ZWHZdG5gZtjAaN/T/VykBbODX5tJz7H1kkaaaLcuwrm00KXTk6V+1HBgG68+JmaKbsrOxEVy0mp52mOvrD45MFooaxFLvAAAAAElFTkSuQmCC');background-size:cover;display:block"></span>
<img class="gatsby-resp-image-image" alt="Times after optimization for buildTree" title="Times after optimization for buildTree" src="/static/831c9d5d2fd63e39d71b3e58d9daf5e6/105d8/source-after.png" srcSet="/static/831c9d5d2fd63e39d71b3e58d9daf5e6/3cf3e/source-after.png 293w,/static/831c9d5d2fd63e39d71b3e58d9daf5e6/78a22/source-after.png 585w,/static/831c9d5d2fd63e39d71b3e58d9daf5e6/105d8/source-after.png 1170w,/static/831c9d5d2fd63e39d71b3e58d9daf5e6/7be33/source-after.png 1558w" sizes="(max-width: 1170px) 100vw, 1170px" style="width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0" loading="lazy"/>
</a>
</span>
<figcaption class="gatsby-resp-image-figcaption">Times after optimization for buildTree</figcaption>
</figure></p><p>You might think (ironically): wow... 227ms... "bravo" 👏. What is 227ms? Taken as a single value, it is indeed a micro-optimization. But remember, the goal was to handle 1M elements, and the path concatenation ran for every sub-folder.</p><h2>Spread operator AKA Object.assign</h2><p>How do you make a shallow copy of an object or extend another object? Nothing simpler: the spread operator <code>...</code>. If you have to support browsers like IE11, you probably use babel.js, just like us... and in the end such a spread operator is translated to <code>Object.assign</code> (*big simplification).</p><p><code>Object.assign</code> is <a href="https://twitter.com/dan_abramov/status/980436488860196864" target="_blank" rel="nofollow noopener noreferrer">relatively slow</a> and can cause problems at a larger scale. In this case, I opted for simple key-by-key copying. Thanks to this simple change, I reduced 154ms to 44ms. And again, for individual elements, it doesn't matter at all, but when iterating over a large data set, such optimizations can work wonders.</p><p><figure class="gatsby-resp-image-figure">
<span class="gatsby-resp-image-wrapper" style="position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1170px">
<a class="gatsby-resp-image-link" href="/static/7bed4f76daac0b292144d25c8a68996a/35252/dan.png" style="display:block" target="_blank" rel="noopener">
<span class="gatsby-resp-image-background-image" style="padding-bottom:93.85665529010238%;position:relative;bottom:0;left:0;background-image:url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAATCAYAAACQjC21AAAACXBIWXMAABYlAAAWJQFJUiTwAAACkklEQVQ4y51U227aQBD1/39NX9qqUtWnSn2okjTENgZsCGBifL97fQGfzgwQkUppkyIddjy7e+bM7Oxqxs0PzCc/sd15WNg21psthuGA43EkHN+G8bRWtR20758+4ObbF+RNhyhOEKcZ/u83oichWpYXcJ88PD6usSF1W9fFbvck9nqzwXq9hk3KfT/AdusStlitHml+I+A1aXYSMRyIcLlcYjKZwDRNGIYBy7KwIt98PhefTj5zOsXDg06+BaY0z7ZhmLi5vZMAByISQlZoL1e4vZ9ANy1YCwfGzIY5t/FLN2GSPZ3NsVjYUl8ebceR0TSncJwldN2AQwI4E9W20KyoxZ1XwwoV9EBhRt+L+AQn7ZCrAeNV4RnjlT0MgygUm1Mu2wPyHij7EQXZPPk+HAUcpCdy7UjO0HORJTFU00jE96LvezS0V9pmpGJ+/fgZ89kCCbVMWZao6xpVVcvCt/xYaVGUUB0RssyqbpBlOTkLRFFMdoY0TWW8IKVgGW2KixZ5Wcl8TH3LRJwu/Z0UciHDMILn7eEHgZwUK+soGqPvh5c2tca1j1OWtr4Q8iSf0EW653nI8/w5HU7f930opbDf79HQdxCE8v3injwrpIglpbCnTZwuq+U68gJGVVWSLgfmeVYVRpGMF6KXhFLQCt7eJ9JADka1vaAl8NVkf90oPHk+ipLXBijp0F4lrCStAAGp4wciSVK5n0yeE2GcJKLQp1Rrao+QlLbtKwq5brxxs90RXBldesp4rGqFnE4xilM0ilINY1HKgRvV/vHYnAm7jk/0BI7aPn8PZ9/ZL7i2O1zvZXBQjQk4Kke8HhmsUJ2D/AsSnERoFW3kRmWClOolTU6HxEiyQu7npUZ/w+U9/A1xprNXkDEExgAAAABJRU5ErkJggg==');background-size:cover;display:block"></span>
<img class="gatsby-resp-image-image" alt="Dan Abramov on Object.assign" title="Dan Abramov on Object.assign" src="/static/7bed4f76daac0b292144d25c8a68996a/105d8/dan.png" srcSet="/static/7bed4f76daac0b292144d25c8a68996a/3cf3e/dan.png 293w,/static/7bed4f76daac0b292144d25c8a68996a/78a22/dan.png 585w,/static/7bed4f76daac0b292144d25c8a68996a/105d8/dan.png 1170w,/static/7bed4f76daac0b292144d25c8a68996a/35252/dan.png 1204w" sizes="(max-width: 1170px) 100vw, 1170px" style="width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0" loading="lazy"/>
</a>
</span>
<figcaption class="gatsby-resp-image-figcaption">Dan Abramov on Object.assign</figcaption>
</figure></p><h2>Value aggregation</h2><p>After "tuning" immer and removing a few <code>Object.assign</code> calls, or rewriting them into a simple loop, I ran out of ideas for "simple" optimizations. It was necessary to change the way the tree was built.</p><p>The previous implementation divided SC locations by source and built a sub-tree for each of them. For each sub-tree, aggregated values were calculated (e.g., if a folder itself contained 10 SC, but also had 50 sub-folders, we wanted to show the summed values). For each such sub-tree, summations were performed, and then the source node was updated with the summed values for all folders.</p><p>Each such operation put something into the redux state. I thought - who needs it? Who needs it? After all, we don't show the tree until everything is calculated and updated. Therefore, I changed the code so that the entire tree, along with the calculated aggregated values, is first built in memory, and then the finished tree is inserted into redux with a single operation.</p><p>Moreover, the tree requirements stated that certain nodes were to be expanded by default, e.g., the first level + potentially the item previously selected on the list (you can go from the list to the tree with a simple button). Previously, expansion was triggered by the <code>toggleNode</code> action. I changed that too - instead of dispatching a redux action, I simply set the <code>expanded</code> value to <code>true</code> directly on the node object.</p><p>You might say - numbers please! :)</p><p>For 54K elements, I went from 12.25s to 2.4s 🚀</p><p>The Product Owner is over the moon.</p><p><img src="https://media.giphy.com/media/ciwIz38tlvDFH08Yuu/giphy.gif" alt="Wow"/></p><h2>Tests for 1M</h2><p>I asked the backend developers to prepare an environment for testing 1M elements. I wanted to see if my optimizations for 54K would hold up. 
And the smile didn't leave my face :)</p><p>Before optimization, the tree-building time was ~3.5 minutes. After applying the changes described above, it dropped to 59s.</p><p>Roughly 70% savings. All in all, one could say - job done - it was supposed to build in less than 60s... 59 is less than 60 😅 It's all good...</p><p>I was a bit tired of digging, but a teammate rightly pointed out:</p><blockquote><p>Well, nice, nice, but for me, it's still slow.</p></blockquote><p>He also added later that he didn't mean to take anything away from my work, and that in his opinion I had done a great job... but it was hard not to agree with him. From the moment of clicking on the navigation element to the time the view was displayed, the user had to wait a total of 90s:</p><ul><li>25s data retrieval (streaming)</li><li>6s browser parsing JSON</li><li>59s tree building</li></ul><p>As a user, if I saw only a spinner for 90s, I would be furious :) I don't want to think about what our users felt when they had to wait 3.5 minutes... probably none of them lasted that long 😅</p><h2>ID generation</h2><p>I dived into the Profiler again. For meta-folders, a <code>folderId</code> was generated. This was because another place in the code needed this <code>id</code> (never mind why). In the end, the generated id meant nothing (it was never sent to the backend). 
However, someone came up with the idea that this meta-folder-id should be a hash of the path...</p><div class="gatsby-highlight" data-language="javascript"><pre class="language-javascript"><code class="language-javascript"><span class="token keyword">export</span> <span class="token keyword">const</span> <span class="token function-variable function">createUniqueIdForLocation</span> <span class="token operator">=</span> <span class="token parameter">path</span> <span class="token operator">=></span> <span class="token function">btoa</span><span class="token punctuation">(</span><span class="token function">encodeURIComponent</span><span class="token punctuation">(</span>path<span class="token punctuation">)</span><span class="token punctuation">)</span></code></pre></div><p>The <code>btoa</code> function encodes a string as base64. It takes an average of 0.25ms... which is a fraction of a millisecond. But when you think about it more - who needs this hash? Who needs it?</p><p><img src="https://media.giphy.com/media/s239QJIh56sRW/giphy.gif" alt="But why?"/></p><p>Exactly! If the meta-folder-id is just the base64 of the path, and the path also contains the source <code>id</code>, which already makes it unique across the entire list, then why even bother with this whole hash?</p><div class="gatsby-highlight" data-language="text"><pre class="language-text"><code class="language-text">- id: createUniqueIdForLocation(path),
+ id: path,
name: getLocationName(path),</code></pre></div><p>This one-line diff took me from 59s to 35s for 1M elements, a gain of ~40% 🤯</p><p>So now the client no longer waited 90s but 66s - including data retrieval and parsing! Considering that the requirements stated that the tree should be built in less than 60s, the Product Owner and clients should be satisfied 😅</p><h2>Next steps</h2><p>Of course, we're not resting on our laurels. Blocking the user for 60s is still a bad idea, so we continue to think about improving the implementation. Maybe we'll finally move it into a web-worker. Who knows? Maybe I'll manage to gather material for the next post 😉.</p><h2>Summary</h2><p>Lesson one - instead of guessing, it's better to measure.</p><p>Lesson two - if you operate at a large scale and iterate over a large data set, optimizations at the <code>ms</code> level per iteration can work wonders 🚀</p><p>Lesson three - put data into redux only when you're ready 💪</p><p>Lesson four - if there's no need, don't complicate things 😉 (see ID & btoa).</p><p>I hope that thanks to this story, you'll reach for the Profiler earlier and manage to improve the performance of more than one application.</p></content:encoded>