<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title></title>
    <link>https://blog.ragecage64.com/</link>
    <description>Recent content on </description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <lastBuildDate>Thu, 05 Feb 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://blog.ragecage64.com/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Why Large OpenTelemetry Collector Binary Size Increases Memory Usage (And Why It&#39;s Not As Bad As It Looks)</title>
      <link>https://blog.ragecage64.com/blog/go-binary-size-mem-usage/</link>
      <pubDate>Thu, 05 Feb 2026 00:00:00 +0000</pubDate>
      <guid>https://blog.ragecage64.com/blog/go-binary-size-mem-usage/</guid>
      <description>&lt;p&gt;This is a written version of a phenomenon I have known about for quite some time, and explained in verbal form in direct conversation and in a couple public talks. The need has come up a few times now to directly reference the details of this phenomenon, but linking out to YouTube links with timestamps where I (often clunkily) verbally explain the issue doesn&amp;rsquo;t feel great. That&amp;rsquo;s why I&amp;rsquo;m putting together this quick post to demonstrate the problem and explain the deeper details.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>This is a written version of a phenomenon I have known about for quite some time, and explained in verbal form in direct conversation and in a couple public talks. The need has come up a few times now to directly reference the details of this phenomenon, but linking out to YouTube links with timestamps where I (often clunkily) verbally explain the issue doesn&rsquo;t feel great. That&rsquo;s why I&rsquo;m putting together this quick post to demonstrate the problem and explain the deeper details.</p>
<h2 id="experiment">Experiment</h2>
<p>To set this up I spun up a plain Debian VM and downloaded <code>otelcol-contrib</code> and <code>otelcol</code> from <a href="https://github.com/open-telemetry/opentelemetry-collector-releases"><code>opentelemetry-collector-releases</code></a>, both from the recent v0.145.0 release. If you&rsquo;re not familiar with how these distributions are put together, they are actually built from the exact same upstream Collector library and component code. The only difference between them is the components included; <code>otelcol-contrib</code> contains every component available in the <code>opentelemetry-collector</code> and <code>opentelemetry-collector-contrib</code> repositories.</p>
<p>I ran the two collectors at the same time on my machine, using <a href="https://gist.github.com/braydonk/fdab97351da89780f01b672c5248ef86">this config</a> in each (technically two copies of it with modified port numbers in each so they could run simultaneously). If you&rsquo;re following along, the command I used to run the collector in the background with redirected output was <code>./otelcol-contrib --config config.yaml &gt; output.log 2&gt;&amp;1 &amp;</code> (and same with <code>otelcol</code> but using <code>config2.yaml</code> and <code>output2.log</code>).</p>
<p>Once these were both running, I checked <code>htop</code>, pressing <code>F4</code> with the filter <code>otelcol</code> so we can look at both processes. I got the following result:</p>
<p><img src="/htop_otelcol.png" alt="htop screenshot showing otelcol-contrib using 4.4% of system memory, whereas otelcol is only using 1.9%"></p>
<p><code>otelcol-contrib</code> is using 4.4% of system memory, with <code>otelcol</code> only using 1.9%, less than half. The <code>RES</code> and <code>SHR</code> values for each of these processes reflect that difference directly. Why is it that two processes largely built from the same code, running the same codepaths since they are on the same config, have such a large difference in memory usage?</p>
<h2 id="investigating-system-level-process-memory-usage">Investigating System-level Process Memory Usage</h2>
<p>Usually if I was investigating memory performance of a Go program, I would start with <code>pprof</code>. I&rsquo;ve done this investigation before, so I know it&rsquo;s actually less interesting to start with direct profiling. If you want to see that analysis see the Optional Colour section <a href="#pprof-heap-analysis"><code>pprof</code> heap analysis</a> (though you may want to read the rest of the article first).</p>
<h3 id="looking-at-procpidstatus">Looking at <code>/proc/[pid]/status</code></h3>
<p>So for starters, we&rsquo;ll dive into more general Linux process memory statistics. We can start with <code>/proc/[pid]/status</code> to look at the memory values. The values in this file are approximations and can be slightly inaccurate, but we can still gather the trend of the important numbers from it:</p>





<pre tabindex="0"><code>$ cat /proc/$(pidof otelcol-contrib)/status
Name:   otelcol-contrib
...
VmPeak:  1571824 kB
VmSize:  1571824 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:    177340 kB
VmRSS:    177340 kB
RssAnon:           30976 kB
RssFile:          146364 kB
RssShmem:              0 kB</code></pre><p>Explaining the rows of interest:</p>
<ul>
<li><code>VmSize</code>: This is the amount of virtual address space requested by the process. This is not a reflection of actual memory in use by the process, rather the virtual address space it has requested from the kernel. This is very high, but it&rsquo;s not terribly controversial in my purposely contrived example. Go&rsquo;s memory manager will always request hefty amounts of virtual address space.</li>
<li><code>VmRSS</code>: RSS is short for &ldquo;Resident Set Size&rdquo;. This means it&rsquo;s the amount of actual space in RAM the process is currently using. This is the same as the <code>RES</code> column in <code>htop</code>. It is the primary number where we see a large gap between <code>otelcol-contrib</code> and <code>otelcol</code>. However, RSS doesn&rsquo;t just mean &ldquo;memory the program has allocated&rdquo;. A process in Linux needs space in RAM for more things than just application memory, such as (crucially) executable code and data from the binary itself. More on this later.</li>
<li><code>RssAnon</code>: This usually <em>is</em> a better indication of memory that was requested by application code or operation in some way. There are some cases where anonymous mappings aren&rsquo;t just standard heap allocations from the program, but in our case these anonymous mappings are largely taken up by Go runtime heap allocations.</li>
<li><code>RssFile</code>: This is an indication of how much of the space in RAM is taken up by data backed by files. If a process needs to read data from a file on disk, it can ask the kernel to map that file into its address space. While memory is available, the kernel keeps the file&rsquo;s pages in RAM so the data can be read directly from there instead of having to go out to disk every time. This is called &ldquo;memory mapping&rdquo; the file (read more at <a href="https://man7.org/linux/man-pages/man2/mmap.2.html"><code>man mmap</code></a>). As we can see, this is a large portion of the overall resident set size of <code>otelcol-contrib</code> currently.</li>
<li><code>RssShmem</code>: This is the part of the resident set backed by shared memory, which per <code>proc_pid_status(5)</code> covers System V shared memory, <code>tmpfs</code> mappings, and shared anonymous mappings. In our case it is 0, so <code>otelcol-contrib</code> isn&rsquo;t holding any of that. It&rsquo;s also worth knowing that non-CGO Go binaries are statically linked, meaning everything that gets loaded from the binary is only usable by processes running that exact binary. This is unlike a typical C program for example, which will almost certainly map the same <code>glibc</code> shared library as every other C program on the system (that sharing would show up under <code>RssFile</code> rather than here).</li>
</ul>
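<p>If you would rather pull these rows out programmatically than eyeball them, here is a minimal Go sketch. It assumes a Linux <code>/proc</code> filesystem, and the helper name <code>parseStatusKB</code> is made up for this example. It only keeps rows shaped like <code>Key: N kB</code> and prints each RSS component as a share of <code>VmRSS</code>.</p>

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseStatusKB (hypothetical helper) extracts the "<Key>: <n> kB" rows from
// the text of a /proc/[pid]/status file, keyed by field name.
func parseStatusKB(status string) map[string]int64 {
	out := make(map[string]int64)
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		key, rest, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue
		}
		fields := strings.Fields(rest)
		// Only keep rows shaped like "177340 kB".
		if len(fields) != 2 || fields[1] != "kB" {
			continue
		}
		n, err := strconv.ParseInt(fields[0], 10, 64)
		if err != nil {
			continue
		}
		out[strings.TrimSpace(key)] = n
	}
	return out
}

func main() {
	// Defaults to the current process; pass a PID as the first argument.
	pid := "self"
	if len(os.Args) > 1 {
		pid = os.Args[1]
	}
	raw, err := os.ReadFile("/proc/" + pid + "/status")
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not read status file:", err)
		return
	}
	kb := parseStatusKB(string(raw))
	rss := kb["VmRSS"]
	if rss == 0 {
		return
	}
	for _, k := range []string{"RssAnon", "RssFile", "RssShmem"} {
		fmt.Printf("%-9s %8d kB (%.1f%% of VmRSS)\n", k, kb[k], 100*float64(kb[k])/float64(rss))
	}
}
```

<p>Running it with <code>$(pidof otelcol-contrib)</code> as the argument should reproduce the split shown above, give or take whatever changed between reads.</p>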
<p>With this context in mind, let&rsquo;s go check on <code>otelcol</code>&rsquo;s status file for comparison:</p>





<pre tabindex="0"><code>$ cat /proc/$(pidof otelcol)/status
Name:   otelcol
...
VmPeak:  1384144 kB
VmSize:  1384144 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:     83520 kB
VmRSS:     83520 kB
RssAnon:           16656 kB
RssFile:           66864 kB
RssShmem:              0 kB</code></pre><p>Let&rsquo;s compare the relevant rows:</p>
<ul>
<li><code>VmSize</code>: Lower, but is not a super important difference for reasons I stated above.</li>
<li><code>VmRSS</code>, <code>RssAnon</code>, <code>RssFile</code>: Lower across the board. <code>VmRSS</code> and <code>RssFile</code> are less than half of what <code>otelcol-contrib</code> takes up in RAM, and <code>RssAnon</code> is a little over half.</li>
<li><code>RssShmem</code>: Also 0 as expected, meaning none of the mappings here can be shared between any processes, and there&rsquo;s no shenanigans like double-counting shared memory between running Collectors or anything; all pages of memory held by each Collector process are completely private.</li>
</ul>
<h3 id="looking-at-memory-map">Looking at Memory Map</h3>
<p>The next step is to look into the direct details of the actual memory mappings of each process. We can get detailed information about all mappings through <code>/proc/[pid]/smaps</code>, but that information is actually a bit more detailed than what we need right now. Instead, we&rsquo;ll use <a href="https://man7.org/linux/man-pages/man1/pmap.1.html"><code>pmap(1)</code></a>, a utility that gives us a simplified view of a process&rsquo; memory map.</p>
<p>Starting with <code>otelcol-contrib</code> again:</p>





<pre tabindex="0"><code>$ pmap -x $(pidof otelcol-contrib)
6962:   ./otelcol-contrib --config config.yaml
Address           Kbytes     RSS   Dirty Mode  Mapping
0000000000400000  155076   72576       0 r-x-- otelcol-contrib
0000000009b71000  182424   79268       0 r---- otelcol-contrib
0000000014d97000    5560    4500     796 rw--- otelcol-contrib
0000000015305000     632     400     400 rw---   [ anon ]
000000c000000000   28672   28672   28672 rw---   [ anon ]
000000c001c00000   36864       0       0 -----   [ anon ]
00007f821bba0000     832     580     580 rw---   [ anon ]
00007f821bc80000      64       4       4 rw---   [ anon ]
00007f821bc94000    3968    3612    3612 rw---   [ anon ]
00007f821c074000    1024       4       4 rw---   [ anon ]
00007f821c174000      72      36      36 rw---   [ anon ]
00007f821c186000   32768       4       4 rw---   [ anon ]
00007f821e186000  263680       0       0 -----   [ anon ]
00007f822e306000       4       4       4 rw---   [ anon ]
00007f822e307000  524284       0       0 -----   [ anon ]
00007f824e306000       4       4       4 rw---   [ anon ]
00007f824e307000  293564       0       0 -----   [ anon ]
00007f82601b6000       4       4       4 rw---   [ anon ]
00007f82601b7000   36692       0       0 -----   [ anon ]
00007f826258c000       4       4       4 rw---   [ anon ]
00007f826258d000    4580       0       0 -----   [ anon ]
00007f8262a06000       4       4       4 rw---   [ anon ]
00007f8262a07000     508       0       0 -----   [ anon ]
00007f8262a86000     384      84      84 rw---   [ anon ]
00007fffabf3d000     132      16      16 rw---   [ stack ]
00007fffabfc5000      16       0       0 r----   [ anon ]
00007fffabfc9000       8       4       0 r-x--   [ anon ]
---------------- ------- ------- -------
total kB         1571824  189780   34228</code></pre><p>The first 3 mappings are standard for basically any ELF binary. These mappings are all named with <code>otelcol-contrib</code>, meaning these are mappings backed by a file called <code>otelcol-contrib</code> which we know for a fact is the binary that this process is running. As we can see, a large portion of the overall RSS of our process is clearly coming from these mappings. I&rsquo;ll discuss what these mappings actually mean in Optional Colour under <a href="#elf-binary-structure">ELF Binary Structure</a>. For now, let&rsquo;s simply take the point that these file-backed mappings take a large amount of the RSS of the overall process.</p>
<p>Let&rsquo;s check this in <code>otelcol</code>:</p>





<pre tabindex="0"><code>$ pmap -x $(pidof otelcol)
8700:   ./otelcol --config config2.yaml
Address           Kbytes     RSS   Dirty Mode  Mapping
0000000000400000   79536   31340       0 r-x-- otelcol
00000000051ac000   73972   36104       0 r---- otelcol
00000000099e9000    2684    1824     376 rw--- otelcol
0000000009c88000     460     252     252 rw---   [ anon ]
000000c000000000   12288   12288   12288 rw---   [ anon ]
000000c000c00000   53248       0       0 -----   [ anon ]
00007f4007636000    4224    3720    3720 rw---   [ anon ]
00007f4007a56000    1024       4       4 rw---   [ anon ]
00007f4007b56000      72      20      20 rw---   [ anon ]
00007f4007b68000   32768       4       4 rw---   [ anon ]
00007f4009b68000  263680       0       0 -----   [ anon ]
00007f4019ce8000       4       4       4 rw---   [ anon ]
00007f4019ce9000  524284       0       0 -----   [ anon ]
00007f4039ce8000       4       4       4 rw---   [ anon ]
00007f4039ce9000  293564       0       0 -----   [ anon ]
00007f404bb98000       4       4       4 rw---   [ anon ]
00007f404bb99000   36692       0       0 -----   [ anon ]
00007f404df6e000       4       4       4 rw---   [ anon ]
00007f404df6f000    4580       0       0 -----   [ anon ]
00007f404e3e8000       4       4       4 rw---   [ anon ]
00007f404e3e9000     508       0       0 -----   [ anon ]
00007f404e468000     384      72      72 rw---   [ anon ]
00007fffbd111000     132      16      16 rw---   [ stack ]
00007fffbd1f3000      16       0       0 r----   [ anon ]
00007fffbd1f7000       8       4       0 r-x--   [ anon ]
---------------- ------- ------- -------
total kB         1384144   85668   16772</code></pre><p>The actual structure of the mappings is almost identical as expected. We can see the file-backed mappings from the binary for the process (<code>otelcol</code> this time). A majority of the RSS is contributed once again by the binary mappings. However, they are much smaller in the <code>otelcol</code> process than the <code>otelcol-contrib</code> process.</p>
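<p>To put a number on that without adding rows up by hand, here is a small Go sketch that tallies the <code>RSS</code> and <code>Dirty</code> columns of <code>pmap -x</code> output, split into file-backed mappings and everything else. The function name <code>tallyPmap</code> is made up for this example, and it assumes the column layout shown above (anything whose mapping name starts with <code>[</code> is treated as not file-backed).</p>

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// totals holds summed kilobyte counts for one class of mapping.
type totals struct{ RSS, Dirty int64 }

// tallyPmap (hypothetical helper) sums the RSS and Dirty columns of
// `pmap -x` rows, split into file-backed mappings and everything else.
func tallyPmap(output string) (file, other totals) {
	for _, line := range strings.Split(output, "\n") {
		f := strings.Fields(line)
		// Data rows look like: Address Kbytes RSS Dirty Mode Mapping...
		if len(f) < 6 {
			continue
		}
		rss, err1 := strconv.ParseInt(f[2], 10, 64)
		dirty, err2 := strconv.ParseInt(f[3], 10, 64)
		if err1 != nil || err2 != nil {
			continue // header row
		}
		t := &other
		if !strings.HasPrefix(f[5], "[") {
			t = &file // named after a file rather than [ anon ] or [ stack ]
		}
		t.RSS += rss
		t.Dirty += dirty
	}
	return file, other
}

func main() {
	// The first few rows of the otelcol-contrib output from this post.
	const sample = `Address           Kbytes     RSS   Dirty Mode  Mapping
0000000000400000  155076   72576       0 r-x-- otelcol-contrib
0000000009b71000  182424   79268       0 r---- otelcol-contrib
0000000014d97000    5560    4500     796 rw--- otelcol-contrib
0000000015305000     632     400     400 rw---   [ anon ]
000000c000000000   28672   28672   28672 rw---   [ anon ]`

	file, other := tallyPmap(sample)
	fmt.Printf("file-backed: rss=%d kB dirty=%d kB\n", file.RSS, file.Dirty)
	fmt.Printf("other:       rss=%d kB dirty=%d kB\n", other.RSS, other.Dirty)
}
```

<p>Piping the full <code>pmap -x</code> output through the same function would give the totals for a live process. The gap between <code>RSS</code> and <code>Dirty</code> on the file-backed side is the part that becomes interesting in the next section.</p>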
<p>This primarily demonstrates that the largest difference in how much space these two processes currently occupy in RAM comes from the size of the root Collector binary. However, that exact choice of words is very deliberate.</p>
<h3 id="what-does-rss-really-tell-us">What does RSS really tell us?</h3>
<p>The only piece of information we know for sure when we look at an RSS value is the number of bytes of RAM this process is currently taking up. Our model of this is slightly simplified by the fact that our process is not sharing any of its memory with any other processes. If it were, we&rsquo;d have to take into account the amount of RSS that our process takes up but shares with others (you can look at things like <a href="https://en.wikipedia.org/wiki/Proportional_set_size">Proportional Set Size</a> which will account for shared memory, but that isn&rsquo;t necessary in this scenario). Overall though, knowing how much memory our process presently takes up in RAM is a really simplistic view of reality.</p>
<p>When a process wants memory, the kernel will allocate space for it in a unit called a page. On Linux systems this is generally 4096 bytes, or 4KiB (you can check for sure on your system with <code>getconf PAGESIZE</code>). If you&rsquo;re new to this concept and didn&rsquo;t understand what I meant when I used the term &ldquo;page&rdquo; earlier in the post, now you know what I mean. :)<br>
When the kernel allocates a page requested by the process, various things are taken into account such as whether the page is private or shared (in our case, all pages we are concerned with are private), whether the page is backed by some data readable from the disk, whether it is something special like page cache etc. All of this is to determine whether a page can be considered &ldquo;reclaimable&rdquo; by the kernel. User-space processes, drivers, or even kernel operations may be holding a lot of pages in RAM at a given time, and if a system is not under memory pressure, then the process might as well keep the pages in RAM because presumably these pages were useful for somebody at some point. However, once the system <strong>is</strong> under some manner of memory pressure, the kernel will do some work to reclaim the least important pages held in RAM at the moment to satisfy new requests.</p>
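<p>As a quick worked example of thinking in pages, this Go sketch converts a kilobyte figure from <code>pmap</code> into a page count. <code>pagesFor</code> is a made-up helper, and <code>os.Getpagesize</code> stands in for <code>getconf PAGESIZE</code>.</p>

```go
package main

import (
	"fmt"
	"os"
)

// pagesFor (hypothetical helper) returns how many pages of pageSize bytes
// are needed to hold kb kilobytes, rounding up.
func pagesFor(kb, pageSize int64) int64 {
	bytes := kb * 1024
	return (bytes + pageSize - 1) / pageSize
}

func main() {
	pageSize := int64(os.Getpagesize())
	fmt.Println("page size in bytes:", pageSize)
	// 72576 kB is the resident size of the r-x mapping of otelcol-contrib above.
	fmt.Println("pages resident in that mapping:", pagesFor(72576, pageSize))
}
```

<p>With 4KiB pages that mapping works out to 18144 resident pages, each of which the kernel can consider for reclaim individually.</p>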
<p>I bring all of this up to say that some of the first pages to go in a high pressure reclaim scenario are often file-backed memory mappings. The first consideration is generally any inactive clean pages, and among the least-recently-used pages, file-backed pages are preferred. An anonymous memory mapping is less preferred in part because if the page is dirty (aka has been written to) it needs to be written to swap space on disk before the page can be reclaimed, otherwise the data would be lost. A read-only file-backed mapping is always clean and thus doesn&rsquo;t have this restriction: the data can simply be read from disk again if it&rsquo;s needed.
While file-backed mappings are preferred, the reclaim still happens in a least-recently-used manner. Pages that are heavily actively referenced won&rsquo;t be reclaimed as readily as dead pages. The <code>otelcol(-contrib)</code> file-backed pages will be considerably active since they are constantly read for the process&rsquo;s operation, but it doesn&rsquo;t mean they aren&rsquo;t reclaimable when push comes to shove. So while at time of checking we found that the first mapping of <code>otelcol-contrib</code> is taking 72576 kB of RSS, that doesn&rsquo;t mean many of its pages won&rsquo;t be reclaimed in a memory pressure scenario.</p>
<p>That means that just looking at RSS doesn&rsquo;t always paint a full picture of what matters in our process&rsquo;s memory usage. The OOM (Out Of Memory) Killer is one of the biggest things you want to avoid, but a high RSS doesn&rsquo;t necessarily mean you need to fear the OOM Killer yet; the OOM Killer will kill a process specifically when it is <strong>unable to reclaim enough space to fill a memory request</strong>. That is to say that it will first do everything it can to reclaim enough pages of memory to satisfy the new allocation request before OOM Killing a process.</p>
<p>Since I imagine the primary audience here is observability-minded folks, let&rsquo;s tie this back to observability for those who haven&rsquo;t long since dropped off the article. Obviously the fresh VM I used for this experiment is experiencing essentially no memory pressure and can very easily satisfy memory requests for the foreseeable future, but if we were getting tight, I might be getting worried about my collector by looking at the higher RSS value. If I&rsquo;m looking at the memory usage of my system, and RSS as my primary per-process memory metric, I might consider the Collector to be contributing to some significant portion of that memory pressure at a glance. But RSS is a cumulative measurement that often measures a lot of reclaimable pages. The RSS itself could still be bad; since it&rsquo;s also measuring dirty anonymous memory mappings, a consistently rising RSS can still indicate a memory leak in the actual program, and a process with a lot of RSS in a high memory pressure system may not be at risk of being killed but still at risk of causing heavy <a href="https://en.wikipedia.org/wiki/Thrashing_(computer_science)">page thrashing</a> for the system. However, we can&rsquo;t actually grasp the true nature of a given process&rsquo; memory usage and what effect it has on our whole system just by its RSS value, and in the case of my contrived example the high RSS is not such a big risk because we know how much of RSS is presently taken up by file-backed pages.</p>
<h3 id="note-on-cgroups">Note on cgroups</h3>
<p>The Collector is often not running as a standalone process like this. Typically it will be running under a <code>cgroup</code>, either as a <code>systemd</code> service or as a container image. A <code>cgroup</code> itself can have a local memory limit. The page reclaim behaviour that I explained in the previous section when the entire system is under memory pressure also applies to when a cgroup is locally hitting its memory limit. However the page reclaim doesn&rsquo;t occur on the whole system, only locally on the pages owned by the cgroup.</p>
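<p>If you want the same anonymous versus file-backed split at the cgroup level, cgroup v2 exposes it in a <code>memory.stat</code> file as plain <code>key value</code> lines in bytes. The Go sketch below parses that format. The default path is an assumption (the cgroup root as seen from inside a cgroup v2 container; a <code>systemd</code> service will live somewhere else under <code>/sys/fs/cgroup</code>), and <code>parseMemoryStat</code> is a made-up helper name.</p>

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseMemoryStat (hypothetical helper) parses the "key value" lines of a
// cgroup v2 memory.stat file. Values are reported in bytes.
func parseMemoryStat(stat string) map[string]uint64 {
	out := make(map[string]uint64)
	for _, line := range strings.Split(stat, "\n") {
		f := strings.Fields(line)
		if len(f) != 2 {
			continue
		}
		n, err := strconv.ParseUint(f[1], 10, 64)
		if err != nil {
			continue
		}
		out[f[0]] = n
	}
	return out
}

func main() {
	// Assumed path; pass the real memory.stat location as the first argument.
	path := "/sys/fs/cgroup/memory.stat"
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	raw, err := os.ReadFile(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not read memory.stat:", err)
		return
	}
	st := parseMemoryStat(string(raw))
	for _, k := range []string{"anon", "file", "active_file", "inactive_file"} {
		fmt.Printf("%-14s %d KiB\n", k, st[k]/1024)
	}
}
```

<p>Note that the <code>file</code> figure here covers all file-backed memory charged to the cgroup, including page cache from ordinary file reads, so it won&rsquo;t line up exactly with a single process&rsquo;s <code>RssFile</code>.</p>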
<h2 id="conclusion">Conclusion</h2>
<p>It still helps to keep Go binary sizes down. Using more RSS can still cause some problems. But the impact that the larger binary actually has on our practical system operation is not as bad as it looks.</p>
<p>From an OpenTelemetry perspective, as a user of the Collector looking at this you might be wondering whether this means you&rsquo;re safe to use contrib or if you should still pursue building a custom collector. There remain great reasons not to use contrib:</p>
<ul>
<li>Larger binary/image takes more disk space</li>
<li>More components = more dependencies = more threat space for vulnerability scanners to yell at you :)</li>
<li>There can be some negative effects to the RSS overhead of the larger binary</li>
</ul>
<p>But overall, even despite some of my previous public statements that the increased memory overhead of contrib is really bad, after looking much deeper into it and understanding Linux kernel memory management more, I understand that it is not as severe as it looks and as I may have previously made it out to be.</p>
<hr>
<h2 id="optional-colour">Optional Colour</h2>
<p>Branching explanations of things that didn&rsquo;t fit nicely into the overall investigation.</p>
<h3 id="elf-binary-structure">ELF Binary Structure</h3>
<p>We can understand more information about those initial binary mappings in the <code>pmap</code> output by understanding a bit more about the ELF (Executable and Linkable Format) binary format. The <a href="https://man7.org/linux/man-pages/man1/readelf.1.html"><code>readelf(1)</code></a> tool can give us a nice readable output that can help us understand more clearly. From that output, let&rsquo;s look at the two most relevant sections when running <code>readelf -a otelcol-contrib</code></p>





<pre tabindex="0"><code>Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x0000000000000150 0x0000000000000150  R      0x1000
  NOTE           0x0000000000000f78 0x0000000000400f78 0x0000000000400f78
                 0x0000000000000064 0x0000000000000064  R      0x4
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x0000000009770691 0x0000000009770691  R E    0x1000
  LOAD           0x0000000009771000 0x0000000009b71000 0x0000000009b71000
                 0x000000000b225688 0x000000000b225688  R      0x1000
  LOAD           0x0000000014997000 0x0000000014d97000 0x0000000014d97000
                 0x000000000056df80 0x000000000060b3c0  RW     0x1000
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x8

 Section to Segment mapping:
  Segment Sections...
   00
   01     .note.go.buildid
   02     .text .note.gnu.build-id .note.go.buildid
   03     .rodata .typelink .itablink .gosymtab .gopclntab
   04     .go.buildinfo .go.fipsinfo .noptrdata .data .bss .noptrbss
   05

There is no dynamic section in this file.

There are no relocations in this file.</code></pre><p>The crucial information here is the 3 <code>LOAD</code> segments, and the respective Section to Segment entries <code>02</code>, <code>03</code>, and <code>04</code> that tell us which sections from the binary end up in each segment. The <code>LOAD</code> segments are what gets loaded into memory for the process. You can see the <code>VirtAddr</code> values (identical to <code>PhysAddr</code> here) for each of these segments are exactly the same as the addresses of the mappings in the <code>pmap</code> output. The Section to Segment mappings then tell us what exactly is in each of those segments. In <code>02</code>, the <code>0x400000</code> segment, we can see the <code>.text</code> section. This is where the actual CPU instructions are, hence it living in a segment with the <code>R E</code> (Read and Execute) flags that becomes <code>r-x</code> mode when mapped by the process. In <code>03</code>, the <code>0x9b71000</code> segment, we have the <code>.rodata</code> (read-only data) section, which is where all constant data ends up. It also contains a couple interesting Go-specific sections, like the <code>.gopclntab</code> which I&rsquo;ll discuss shortly. In <code>04</code>, the <code>0x14d97000</code> segment, we have the <code>.data</code> and <code>.bss</code> sections. The <code>.data</code> section contains non-constant data that is known at compile time, i.e. something like <code>var x int = 42</code> at the top of a file would make it into the initialized data because its starting value is known at compile time. The <code>.bss</code> section is where uninitialized (zero-valued) data goes, such as <code>var x int</code> without being set. These go in the writeable segment because the program may end up changing the values of data from this segment.</p>
<p>Further elaborating on the <code>.gopclntab</code>, this is the Program Counter Line Table. This is a table mapping program counter values (program counter being an address of an instruction the program could be running) to source information from the original Go code used to compile the binary. Have you ever wondered how even a binary stripped of symbols via <code>-ldflags=&quot;-s -w&quot;</code> still produces a backtrace to actual source locations when <code>panic</code>ing? It&rsquo;s because no matter what, this section of the binary is always kept intact. The runtime uses it for various things, but the <code>panic</code> backtrace is the most obvious one. You can see some more discussion about this in <a href="https://github.com/golang/go/issues/36555">golang/go#36555</a> where the option to not write symbols to the pclntab was requested and rejected.</p>
<p>The two most obvious reasons a Go binary grows when adding more dependencies are that more code = more instructions, and that more code = more symbols to write to the pclntab. This means that a collector with more components grows substantially in both of these sections.</p>
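<p>You can poke at this lookup yourself through the <code>runtime</code> package. The sketch below resolves a program counter back to a function name, file, and line, which is the same kind of lookup a <code>panic</code> backtrace performs. <code>whereAmI</code> is a made-up name for the example.</p>

```go
package main

import (
	"fmt"
	"runtime"
)

// whereAmI (hypothetical helper) resolves its caller's program counter to a
// function name, file, and line using the runtime's symbol tables.
func whereAmI() (function, file string, line int) {
	pcs := make([]uintptr, 1)
	// Skip runtime.Callers and whereAmI so pcs[0] belongs to our caller.
	if runtime.Callers(2, pcs) == 0 {
		return "", "", 0
	}
	frame, _ := runtime.CallersFrames(pcs).Next()
	return frame.Function, frame.File, frame.Line
}

func main() {
	fn, file, line := whereAmI()
	fmt.Printf("called from %s at %s:%d\n", fn, file, line)
}
```

<p>Going by the point above, building this with <code>-ldflags=&quot;-s -w&quot;</code> should still print a source location, because the table backing the lookup is kept in the binary.</p>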
<h3 id="pprof-heap-analysis"><code>pprof</code> Heap Analysis</h3>
<p>The direct heap analysis reveals some interesting things, but nothing all that substantially exciting.</p>
<p>Looking at the <code>heap</code> profile for <code>otelcol</code> first:</p>





<pre tabindex="0"><code>$ go tool pprof http://localhost:1778/debug/pprof/heap
Fetching profile over HTTP from http://localhost:1778/debug/pprof/heap
Saved profile in /home/braydonk_google_com/pprof/pprof.otelcol.alloc_objects.alloc_space.inuse_objects.inuse_space.002.pb.gz
File: otelcol
Build ID: 785743821d9ac664638f9342e4270b2bdbcef397
Type: inuse_space
Time: 2026-02-06 03:14:43 UTC
Entering interactive mode (type &#34;help&#34; for commands, &#34;o&#34; for options)
(pprof) top
Showing nodes accounting for 4121.87kB, 100% of 4121.87kB total
Showing top 10 nodes out of 34
      flat  flat%   sum%        cum   cum%
 1536.51kB 37.28% 37.28%  1536.51kB 37.28%  go.uber.org/zap/zapcore.newCounters (inline)
 1024.28kB 24.85% 62.13%  1024.28kB 24.85%  k8s.io/api/core/v1.init
  532.26kB 12.91% 75.04%   532.26kB 12.91%  github.com/xdg-go/stringprep.map.init.2
  516.76kB 12.54% 87.58%   516.76kB 12.54%  runtime.procresize
  512.05kB 12.42%   100%   512.05kB 12.42%  runtime.(*scavengerState).init
         0     0%   100%  1536.51kB 37.28%  github.com/spf13/cobra.(*Command).Execute
         0     0%   100%  1536.51kB 37.28%  github.com/spf13/cobra.(*Command).ExecuteC
         0     0%   100%  1536.51kB 37.28%  github.com/spf13/cobra.(*Command).execute
         0     0%   100%   532.26kB 12.91%  github.com/xdg-go/stringprep.init
         0     0%   100%   768.26kB 18.64%  go.opentelemetry.io/collector/exporter.(*factory).CreateLogs</code></pre><p>As we can see, the heap profile is only accounting for about 4MB of <code>inuse_space</code>. This is because, as the rest of the article explains, the Go heap itself is only a small proportion of the RSS of the program. In here the top node is from <code>zapcore</code>, likely the result of different components setting up zap loggers, as well as things from package <code>init</code>, which is allocation that happens as a result of the package being <code>import</code>ed at all (either allocations that happen in global variables, or in explicit <code>func init()</code>s).</p>
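<p>For a rough cross-check without <code>pprof</code>, the runtime can report its own accounting through <code>runtime.ReadMemStats</code>. The sketch below prints the heap in use next to the total the Go runtime has obtained from the OS. These figures only cover memory the Go runtime manages, so the file-backed mappings of the binary are in neither number. <code>goMemory</code> is a made-up helper name.</p>

```go
package main

import (
	"fmt"
	"runtime"
)

// goMemory (hypothetical helper) reports, in KiB, the heap spans currently
// in use and the total memory the Go runtime has obtained from the OS.
func goMemory() (heapInuseKiB, sysKiB uint64) {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapInuse / 1024, m.Sys / 1024
}

func main() {
	heap, sys := goMemory()
	fmt.Printf("heap in use: %d KiB\n", heap)
	fmt.Printf("obtained from OS by the Go runtime: %d KiB\n", sys)
}
```

<p>Comparing those numbers with <code>VmRSS</code> for the same process is a quick way to see how much of the resident set is not Go heap at all.</p>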
<p>We did see through our analysis that <code>otelcol-contrib</code> uses more <code>RssAnon</code>, so it might have more heap space allocated. Is that the case?</p>





<pre tabindex="0"><code>$ go tool pprof http://localhost:1777/debug/pprof/heap
Fetching profile over HTTP from http://localhost:1777/debug/pprof/heap
Saved profile in /home/braydonk_google_com/pprof/pprof.otelcol-contrib.alloc_objects.alloc_space.inuse_objects.inuse_space.002.pb.gz
File: otelcol-contrib
Build ID: 3cd07bf5097963927ede2748cc4538b224f3906a
Type: inuse_space
Time: 2026-02-06 03:22:14 UTC
Entering interactive mode (type &#34;help&#34; for commands, &#34;o&#34; for options)
(pprof) top
Showing nodes accounting for 11801.29kB, 63.79% of 18501.38kB total
Showing top 10 nodes out of 143
      flat  flat%   sum%        cum   cum%
 2609.95kB 14.11% 14.11%  2609.95kB 14.11%  regexp/syntax.(*compiler).inst
 2561.41kB 13.84% 27.95%  2561.41kB 13.84%  github.com/aws/aws-sdk-go/aws/endpoints.init
 1064.52kB  5.75% 33.70%  1064.52kB  5.75%  google.golang.org/protobuf/reflect/protoregistry.(*Files).RegisterFile.func2
    1027kB  5.55% 39.26%     1027kB  5.55%  google.golang.org/protobuf/internal/filedesc.(*File).initDecls (inline)
 1025.88kB  5.54% 44.80%  1025.88kB  5.54%  regexp.onePassCopy
 1024.47kB  5.54% 50.34%  2051.47kB 11.09%  google.golang.org/protobuf/internal/filedesc.newRawFile
  768.26kB  4.15% 54.49%   768.26kB  4.15%  go.uber.org/zap/zapcore.newCounters
  655.29kB  3.54% 58.03%   655.29kB  3.54%  runtime.itabsinit
  532.26kB  2.88% 60.91%   532.26kB  2.88%  github.com/DataDog/viper.(*Viper).SetKnown
  532.26kB  2.88% 63.79%   532.26kB  2.88%  github.com/vmware/govmomi/vim25/types.Add</code></pre><p>It is! Around 14MB more in total. I can&rsquo;t exactly pretend that difference is anything to get <strong>that</strong> worked up about though. However it is interesting to note that the difference is a result of us having more dependencies in our binary. I&rsquo;d have to dig further into the profile to see which package is precompiling regexes that isn&rsquo;t present in <code>otelcol</code>, but <code>aws/endpoints.init</code> makes it pretty obvious that the AWS SDK that backs the AWS exporters has some allocations either in a <code>func init</code> or in global variables. You can also see <code>DataDog</code>, <code>vmware</code>, and more represented in this snippet.</p>
<p>But hey, we didn&rsquo;t configure any of those components in our <a href="https://gist.github.com/braydonk/fdab97351da89780f01b672c5248ef86">test config</a>, why are these package inits happening? We&rsquo;re not gonna use the AWS stuff, so those allocations are a total waste for us. When the Collector starts up, it takes a registry of all components it was built with and needs to initialize all the component factories so that they can be instantiated in the case that they are configured. This means there&rsquo;s no avoiding the fact that these packages are unfortunately imported on Collector startup whether we asked for the relevant components or not.</p>
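<p>Here is a toy illustration of that effect. It is not the Collector&rsquo;s actual code: the <code>factory</code> type, the <code>registry</code> map, and the component names are all made up. The point is only that work done at package initialization happens for everything compiled in, while the config decides what actually gets used afterwards.</p>

```go
package main

import "fmt"

// factory is a hypothetical stand-in for a component factory.
type factory struct {
	name   string
	lookup map[string]string // filled eagerly, like a package-level table
}

// registry plays the role of the set of factories compiled into the binary.
var registry = map[string]*factory{}

// register builds a factory and its lookup table up front.
func register(name string, entries int) {
	f := &factory{name: name, lookup: make(map[string]string, entries)}
	for i := 0; i < entries; i++ {
		f.lookup[fmt.Sprintf("%s-%d", name, i)] = "endpoint"
	}
	registry[name] = f
}

// init runs at program start, before main and before any config is read.
// In a real build each component would do this from its own package.
func init() {
	register("awsexporter", 1000)
	register("otlpexporter", 10)
}

// configured returns only the factories that a config actually references.
func configured(names ...string) []*factory {
	var out []*factory
	for _, n := range names {
		if f, ok := registry[n]; ok {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	used := configured("otlpexporter")
	fmt.Printf("registered=%d configured=%d\n", len(registry), len(used))
	fmt.Println("unused awsexporter entries still allocated:", len(registry["awsexporter"].lookup))
}
```

<p>The <code>awsexporter</code> table is built even though nothing asks for it, which is the same shape as the <code>aws/endpoints.init</code> line in the profile above.</p>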
<hr>
<h3 id="sources">Sources</h3>
<p>Following are the most important resources that I either pulled information from directly for this article, or that more deeply explain things I went over here:</p>
<ul>
<li>Official Kernel docs on memory management, most of the page reclaim info came from here: <a href="https://docs.kernel.org/admin-guide/mm/concepts.html">https://docs.kernel.org/admin-guide/mm/concepts.html</a></li>
<li>The Linux Memory Manager by Lorenzo Stoakes, which I have been reading an early access copy of and loving: <a href="https://nostarch.com/linux-memory-manager">https://nostarch.com/linux-memory-manager</a></li>
<li>The Wikipedia page on Thrashing, which I linked in-place but am repeating here for posterity, explains the concept pretty well: <a href="https://en.wikipedia.org/wiki/Thrashing_(computer_science)">https://en.wikipedia.org/wiki/Thrashing_(computer_science)</a></li>
<li><code>mmap(2)</code> man page which explains the different parameters that can be used when mapping a page: <a href="https://man7.org/linux/man-pages/man2/mmap.2.html">https://man7.org/linux/man-pages/man2/mmap.2.html</a></li>
<li><code>proc_pid_status(5)</code> man page which explains what <code>/proc/[pid]/status</code> fields mean: <a href="https://man7.org/linux/man-pages/man5/proc_pid_status.5.html">https://man7.org/linux/man-pages/man5/proc_pid_status.5.html</a></li>
<li>Other man pages referenced are linked in the spots I bring them up</li>
<li>I sort of cut it out of the article, but this issue was great reading to see more about Go&rsquo;s heap allocation strategy. And it made me learn a fun new feature of Go 1.26, which is that the <code>0xc000000000</code> special address where the heap always starts today can be randomized through an experiment. Neat! <a href="https://github.com/golang/go/issues/27583">https://github.com/golang/go/issues/27583</a></li>
</ul>
]]></content:encoded>
    </item>
    <item>
      <title>February 2026</title>
      <link>https://blog.ragecage64.com/journal/february-2026/</link>
      <pubDate>Wed, 04 Feb 2026 00:00:00 +0000</pubDate>
      <guid>https://blog.ragecage64.com/journal/february-2026/</guid>
      <description>&lt;h1 id=&#34;feb-4-2026&#34;&gt;Feb 4, 2026&lt;/h1&gt;&#xA;&lt;p&gt;Starting my journal yippee. :)&lt;br&gt;&#xA;My biggest personal shortcoming that I want to overcome this year is putting my thoughts down in writing. Some people I work with might find this a bit silly given how much I write down when I can actually get myself to do it. My problem is how hard it is to get started writing a proper piece, be it doc or blog post; I get so caught up in all the details and all the things I wanna say, and say as accurately and correctly as possible, that I end up dragging my feet big time getting started. So in this journal I intend to track the different random stuff I&amp;rsquo;m doing and tech thoughts I&amp;rsquo;m having throughout the day. I&amp;rsquo;m hoping that keeping the pressure low helps me build a habit of writing.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h1 id="feb-4-2026">Feb 4, 2026</h1>
<p>Starting my journal yippee. :)<br>
My biggest personal shortcoming that I want to overcome this year is putting my thoughts down in writing. Some people I work with might find this a bit silly given how much I write down when I can actually get myself to do it. My problem is how hard it is to get started writing a proper piece, be it doc or blog post; I get so caught up in all the details and all the things I wanna say, and say as accurately and correctly as possible, that I end up dragging my feet big time getting started. So in this journal I intend to track the different random stuff I&rsquo;m doing and tech thoughts I&rsquo;m having throughout the day. I&rsquo;m hoping that keeping the pressure low helps me build a habit of writing.</p>
<p>The theme of today is definitely that migration is hard. It always surprises me how often the things we treat as black boxes are not so easy in-and-out as we&rsquo;d like to think. I&rsquo;d love to be able to think of telemetry collection as a Unix-esque utility, where data comes in, we do something to the data, and then we send it somewhere else. Thinking of it in this simplistic way does help simplify learning for those new to observability, but as an observability implementer and product creator, we simply can&rsquo;t afford to model it like this. There are so many tiny details in observability instrumentation that can get lost. A big one is the exact shape of internal telemetry data; you&rsquo;d like to think you&rsquo;d have some standard golden set of metrics you can always get and expect everywhere, but even an agreed Semantic Convention isn&rsquo;t enough to ensure that any possible observability solution you use is going to be instrumenting it exactly as expected.<br>
Case in point, I was looking with a coworker at some of the Collector self metric instrumentation. Specifically trying to track down how GRPC metrics are instrumented in each component to try and determine why the googlecloud exporter has some slightly different looking GRPC metrics than the otlpexporter. They both call into <code>otelgrpc</code> and get dialoptions from it, though using a slightly different API to do so. Doesn&rsquo;t explain why we get <code>grpc.*</code> metrics from the googlecloud exporter while the otlp exporter makes <code>rpc.*</code> metrics happen. Couldn&rsquo;t come up with a conclusive answer to this since I don&rsquo;t really deeply understand the <code>configgrpc</code> package or the <code>otelgrpc</code> package. But it feeds into my above thoughts because the reason we have to dig into the code to figure this stuff out is that we have to derive an API Request Count metric in a way that matches user expectations for the Ops Agent when we start to move to OTLP export. Sure, the two exporters under the knife use the same package to instrument, but even then we can&rsquo;t come up with a 1:1 translation for the thing we need. We&rsquo;re going to need to do some other tricks to derive the same answer. This isn&rsquo;t even to mention the cases where we can&rsquo;t get an equivalent metric, or a metric is recorded incorrectly and thus not produced properly by the exporter we&rsquo;re moving to. Migration in observability tooling is a painful experience, and in the current landscape requires migrators to have a pretty deep understanding of what it is they&rsquo;re looking at and what the data all means.</p>
<p>I addressed a Collector Contrib issue about the <code>system.processes.created</code> metric in the hostmetrics receiver, which I didn&rsquo;t realize that we only produce on Linux and BSD. That decision was before my time, but I assume it was done based on their state of implementation in <code>gopsutil</code>. <code>gopsutil</code> likely only gets that info for these two platforms because only these two have nice APIs to get this info (<code>/proc/stat</code> fork count in Linux, sysctl <code>kern.forkstat</code> in OpenBSD). You can technically get this info in Windows and MacOS if you wanted to do some super cringe stuff, but not things we&rsquo;re willing to implement in hostmetrics. <a href="https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/45721">https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/45721</a></p>
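<p>For reference, the Linux side of this is about as simple as it gets. The sketch below is my own illustration and not the actual <code>gopsutil</code> or hostmetrics code: it assumes the usual <code>/proc/stat</code> layout where a line starting with <code>processes</code> holds the number of forks since boot, and <code>parseForkCount</code> is a made-up helper name.</p>

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseForkCount scans /proc/stat-formatted text for the "processes" line,
// which holds the number of processes created (forks) since boot.
func parseForkCount(stat string) (uint64, error) {
	for _, line := range strings.Split(stat, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 2 && fields[0] == "processes" {
			return strconv.ParseUint(fields[1], 10, 64)
		}
	}
	return 0, errors.New("no processes line found")
}

func main() {
	data, err := os.ReadFile("/proc/stat")
	if err != nil {
		// Expected on platforms without a Linux-style procfs.
		fmt.Println("could not read /proc/stat:", err)
		return
	}
	n, err := parseForkCount(string(data))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("processes created since boot:", n)
}
```

<p>Nothing comparable to this one-line lookup exists on Windows or macOS, which is the gap described above.</p>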
<p>I started today generally distracted as I remembered that we need to make the OS packaging for Google-Built OpenTelemetry Collector link to BoringCrypto. Goreleaser is really powerful but not the easiest thing in the world to use; getting a template that worked right took a bit of effort. I&rsquo;ve never been a huge fan of their docs, which are just a giant yaml file with all available fields and a bunch of comments. I&rsquo;m just kind of hunting and pecking around, having an easier time just letting Gemini link me out to the documentation it can find better than normal Google searches lol<br>
Getting the config working for BoringCrypto linking wasn&rsquo;t too bad once I figured out the config quirks, but doing cross-platform compilation is a huge pain; in the standard <code>distrogen</code> generated container for cross-platform builds, this was a bit simpler because we could leverage Docker cross-building capabilities to orchestrate it. But <code>goreleaser</code> needs to build for all platforms in one go. This means the container will need to always install all cross-compilation toolchain stuff, and the goreleaser config will need to assume it&rsquo;s running on Debian with all the cross-compilation toolchains set up. Kind of annoying but not the end of the world, since the expectation is typically going to be that users do the goreleaser releases from a containerized environment anyway.<br>
In the end, I managed to get BoringCrypto working with proper cross-compilation to at least arm64. What a pain, but luckily I got it working and hopefully shouldn&rsquo;t need to look at it again any time soon&hellip; <a href="https://github.com/GoogleCloudPlatform/opentelemetry-operations-collector/pull/490">https://github.com/GoogleCloudPlatform/opentelemetry-operations-collector/pull/490</a></p>
<p>I saw another AI slop PR in Collector Contrib today. My reaction to obvious LLM-proxy sycophant comments is getting more and more visceral each time. If you&rsquo;re reading this and are thinking of submitting an LLM-powered PR&hellip; Think about something else. :)</p>
<p><strong>Song of the day</strong>: A.M War by Karnivool<br>
I have been listening to a lot of Karnivool since the new album comes out this Friday! First album in 13 years!!!</p>
<h1 id="feb-5-2026">Feb 5, 2026</h1>
<p>Had a horrendous sleep last night, let&rsquo;s see what I can reasonably accomplish today. If I could, I&rsquo;d pick some lower brainpower tasks, but I&rsquo;ve got too much to do&hellip;</p>
<p>I worked a bit on <code>distrogen</code> documentation. I have a large wave of new features to add, and I&rsquo;m excited to be able to share the tool with new OTel community members whilst actually having usage documentation to go with it.</p>
<p>I don&rsquo;t have a lot else to say journal-wise today because of just how hard I worked on the Go Binary Size blog post. It was my hardest post yet to write, but I&rsquo;m really proud of it.</p>
<p><strong>Song of the day</strong>: Break Those Bones Whose Sinews Gave It Motion by Meshuggah<br>
I focus well to music that is incredibly aggressive, catchy, and groovy. This song is all 3 to me. Maybe not melodically catchy, but man that rhythm hooks me&hellip;</p>
<h1 id="feb-6-2026">Feb 6, 2026</h1>
<p>I slept a little better last night at least.</p>
<p>Today I have a lot of writing to do. I have to write a handoff for the GBOC packaging project, I have to finish the distrogen docs, and I&rsquo;m writing a small piece about Collector plugin loading at runtime. So it will be a sparse journal entry once again likely.</p>
<p>Everyone is all excited today about the fact that Claude agents were able to write a C compiler. Cool I guess? It feels like pure hype-bait to me, except worse than the web browser because this time it was something that &ldquo;worked&rdquo;; that web browser was an obvious piece of garbage, but this compiles C code wow! But I&rsquo;m really not shocked at all that an AI could write a C compiler tbh. The language is very heavily specified in text, and there are lots of C compilers that the AI is likely already trained on. Building something that can successfully compile C code isn&rsquo;t exactly rocket science. Building something that can compile C code with all of the immense optimization work that goes into popular compilers like gcc is where the rocket science really happens.</p>
<p>Yesterday we talked about issues with double-writing old and new schemas when receivers are transitioning to semconv. The reconciliation is quite difficult if the two schemas have a metric with the same name but other details about the metric changing. We came up with a decent reconciliation pattern here, but it might be confusing for new users in multiple ways. <a href="https://github.com/open-telemetry/opentelemetry-collector/pull/14538#discussion_r2775609768">https://github.com/open-telemetry/opentelemetry-collector/pull/14538#discussion_r2775609768</a></p>
<p><strong>Song of the day</strong>: The entire IN VERSES album by Karnivool<br>
The new album came out, and dare I say it was worth the 13 year wait. This album closer is transcendental.</p>
<h1 id="feb-7-2026">Feb 7, 2026</h1>
<p>I&rsquo;m doing a bit of writing today, but will be busy with a family event for the rest of the day so not much to say!</p>
<p><strong>Song of the day</strong>: My Pantheon (Forevermore) by Kamelot<br>
They are kind of a guilty pleasure band for me. Common perception appears to be that they are corny and formulaic. But for me that corn is buttered and salted, and it&rsquo;s a damn good formula.</p>
]]></content:encoded>
    </item>
    <item>
      <title>Dynamic linking madness: solving a bug in go-nvml</title>
      <link>https://blog.ragecage64.com/blog/dynamic-linking-madness/</link>
      <pubDate>Sat, 15 Feb 2025 00:00:00 +0000</pubDate>
      <guid>https://blog.ragecage64.com/blog/dynamic-linking-madness/</guid>
      <description>&lt;p&gt;I work on open source observability software, primarily the &lt;a href=&#34;https://github.com/GoogleCloudPlatform/ops-agent&#34;&gt;Google Cloud Ops Agent&lt;/a&gt;, &lt;a href=&#34;https://opentelemetry.io/docs/collector/&#34;&gt;OpenTelemetry Collector&lt;/a&gt;, and &lt;a href=&#34;https://github.com/fluent/fluent-bit&#34;&gt;Fluent Bit&lt;/a&gt;.&lt;br&gt;&#xA;Over the past few years, I have gained an affinity for taking on the types of deep issues that have me journeying as deep into the weeds as I can get. In this post I&amp;rsquo;m going to go over one of those issues, perhaps partially to self-document everything I learned but also because I think it was an interesting journey worth writing down.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>I work on open source observability software, primarily the <a href="https://github.com/GoogleCloudPlatform/ops-agent">Google Cloud Ops Agent</a>, <a href="https://opentelemetry.io/docs/collector/">OpenTelemetry Collector</a>, and <a href="https://github.com/fluent/fluent-bit">Fluent Bit</a>.<br>
Over the past few years, I have gained an affinity for taking on the types of deep issues that have me journeying as deep into the weeds as I can get. In this post I&rsquo;m going to go over one of those issues, perhaps partially to self-document everything I learned but also because I think it was an interesting journey worth writing down.</p>
<h2 id="the-issue-go-nvml-crashes-our-opentelemetry-collector">The Issue: go-nvml crashes our OpenTelemetry Collector</h2>
<p>One of the features of the Ops Agent is GPU Monitoring; if you install the Ops Agent on a GCE VM with a GPU, you will automatically get metrics for it through the <a href="https://developer.nvidia.com/nvidia-management-library-nvml">NVIDIA Management Library (NVML)</a>, and optionally through <a href="https://developer.nvidia.com/dcgm">DCGM</a>. To achieve this, we built specific instrumentation using the <a href="https://github.com/NVIDIA/go-nvml">Go bindings for NVML</a> and for DCGM.</p>
<p>We learned when attempting to upgrade our build of the Collector to Go 1.21 that the Collector would crash on startup if a GPU was present on the machine. It produced the kind of panic you wouldn&rsquo;t usually be used to seeing in a Go program:</p>





<pre tabindex="0"><code>SIGSEGV: segmentation violation
PC=0x0 m=0 sigcode=1
signal arrived during cgo execution</code></pre><p>Seeing <code>PC=0x0</code> was very surprising to me. I had no idea how this sort of thing could occur in a Go program, even with CGO. Even more strange was that this crash was only happening on certain systems. How could something like a segfault be system dependent?<br>
I was absolutely hooked. I would not rest until I understood why this could possibly be happening.</p>
<p>You can read <a href="https://github.com/NVIDIA/go-nvml/issues/36">the original issue in go-nvml</a> and <a href="https://github.com/golang/go/issues/63264">the issue I opened in golang/go</a> to see the real discussions, or read on for my direct retelling.</p>
<h2 id="intro-to-dynamic-libraries">Intro to dynamic libraries</h2>
<p>This is information that I feel is important to understand the underlying issue. If you are already familiar with how dynamic libraries are loaded, you can skip to <a href="#how-go-nvml-works">How go-nvml works</a>.</p>
<h3 id="dynamic-vs-static-linking">Dynamic vs Static Linking</h3>
<p>In C and adjacent languages, there are two ways to link a library to your application: static, and dynamic. Static linking is pretty straightforward; the library code is included at compile-time, and when the library is compiled into an object, it is then linked directly into the resulting binary. When the compiled program is run and something from the library is referenced, the implementation is already present within the binary. With dynamic linking, rather than the libraries being built directly into the binary, the libraries are simply referenced by the application to then be loaded at runtime. These will be <code>.so</code> on Linux or <code>.dll</code> on Windows. When the application is run, the operating system receives instructions to look for the libraries on the system, and if they are found they are loaded for the program to use, or if not found the program fails to start.</p>
<p>Static linking sure does sound great, right? There&rsquo;s not much to think about there, the code is just included in the binary rather than needing to worry about having specific dynamic libraries on the system. Why wouldn&rsquo;t you always do that? Golang agrees with you; all binaries built with pure Go are completely statically linked. This is actually a selling point of the language, and as an avid user of it I can feel the benefits. It is so nice to build a giant Go program, and just have one nice clean binary at the end with everything the binary needs. As someone working on a <a href="https://github.com/google/yamlfmt">tool written in Go</a>, I love that building and distributing it is so dead simple because it&rsquo;s one statically linked binary. No separate instructions that certain libraries have to be <code>apt install</code>ed onto the system, or being forced to distribute a container image for the tool to be usable.</p>
<p>Dynamic linking does have a purpose though, especially when writing lower level applications. One of the most popular ones is C runtime libraries, an implementation of which is available on any Linux distribution, or can be installed on Windows through the <code>Visual C++ Redistributable</code> (something I&rsquo;m sure many gamers have installed and not really known why). C runtimes can be statically linked in most compilers, however it often doesn&rsquo;t make much sense to statically link something that is available on most any system the application will run on. One of the biggest reasons is binary sizes. I&rsquo;ve seen people online be quite confused at the size of a simple Go Hello World program exceeding a megabyte (at least at the time), but the reason for this is that Go does indeed statically link its runtime with the binary which balloons the size of the binary.</p>
<p>Large binaries with lots of statically linked libraries have other complications as well, such as the amount of memory the program can take to run. I&rsquo;d like to write a separate blog post about this at some point, but in short, large statically linked binaries can take more memory to run because loading the binary instructions and data in the first place takes up more space in RAM. The difference with dynamically loading libraries is that the space the library takes up in memory can be shared by any other processes using the library. So if we just take dynamically linking <code>libc</code> as an example, there are probably tons of other applications on the system also dynamically loading libc and all sharing that memory in RAM. If all those same binaries had statically linked <code>libc</code>, then they would each have a private copy of <code>libc</code> with all the space in memory that would take up and would be unable to share with any other processes on the system.</p>
<h3 id="dynamic-loading">Dynamic Loading</h3>
<p>The other way to interact with dynamic libraries is by loading them explicitly. With dynamic linking, the required libraries are built into the binary for the system to discover when the program is loaded. However, sometimes the exact library to be used can&rsquo;t be known at compile time. There may be multiple versions of the library that the program is built to work with, and there needs to be some logic done at runtime to determine exactly which library is loaded. This is common with versioned APIs, where there may be <code>v2</code> versions of functions present in dynamic libraries (rather than just changing the existing functions in place, so that backwards compatibility can be maintained, which is really important for dynamic libraries).<br>
So the alternative method is loading the libraries at runtime using <code>dlopen</code> in Linux, or <code>LoadLibrary</code> in Windows. This gives you a handle to the library loaded into program memory, and to find symbols in it you can look them up in the loaded library using <code>dlsym</code> in Linux or <code>GetProcAddress</code> in Windows.</p>
<h3 id="exporting-dynamic-symbols-linux-elf-binaries">Exporting Dynamic Symbols (Linux ELF binaries)</h3>
<p>We have now exceeded my knowledge of how this might work in Windows, so this section is specific to ELF binaries on Linux.</p>
<p>What typically happens in the linking step is the linker maintains all external references to dynamic symbols in two sections of the binary called the PLT (Procedure Linkage Table) and the GOT (Global Offset Table). The PLT maintains references to all dynamic symbols used, while the GOT maintains the actual address of known dynamic symbols. Upon usage of a dynamic symbol, the compiler references the PLT entry for that symbol. At the linking stage, the linker will add those known symbols to the GOT. At runtime, when a PLT entry is called, it will look for an entry in the GOT and jump to that address, otherwise it will try to resolve the symbol manually.</p>
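<p>If it helps to picture the mechanism, below is a toy model of lazy binding written in plain Go. It is only an analogy, not how the dynamic loader is implemented: the <code>got</code> map, the <code>library</code> table, and <code>callPLT</code> are all invented for the illustration. Each GOT slot starts out pointing at a resolver stub, the first call through the PLT resolves the symbol and patches the slot, and later calls jump straight to the target.</p>

```go
package main

import "fmt"

// got is a toy Global Offset Table: one slot per dynamic symbol, holding the
// function that a call should end up in.
type got map[string]func(string)

// library stands in for the symbol table of a loaded shared object.
var library = map[string]func(string){
	"puts": func(s string) { fmt.Println(s) },
}

// resolutions counts how many times the slow resolver path runs.
var resolutions int

// newGOT fills every slot with a resolver stub. On first use the stub looks
// the symbol up, patches the slot with the real target, and forwards the call.
func newGOT(symbols ...string) got {
	g := got{}
	for _, name := range symbols {
		name := name // capture per-iteration copy for the closure
		g[name] = func(arg string) {
			resolutions++
			target, ok := library[name]
			if !ok {
				panic("undefined symbol: " + name)
			}
			g[name] = target // patch the slot so later calls skip the resolver
			target(arg)
		}
	}
	return g
}

// callPLT plays the role of a PLT stub: an indirect jump through the GOT slot.
func callPLT(g got, name, arg string) {
	g[name](arg)
}

func main() {
	g := newGOT("puts")
	callPLT(g, "puts", "hi")       // slow path: resolves, patches, then calls
	callPLT(g, "puts", "hi again") // fast path: slot already holds the target
	fmt.Println("resolver ran", resolutions, "time(s)")
}
```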
<p>Let&rsquo;s see this in action with a very simple C program:</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c" data-lang="c"><span style="display:flex;"><span><span style="color:#75715e">#include</span> <span style="color:#75715e">&lt;stdio.h&gt;</span><span style="color:#75715e">
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">int</span> <span style="color:#a6e22e">main</span>() {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">printf</span>(<span style="color:#e6db74">&#34;hi</span><span style="color:#ae81ff">\n</span><span style="color:#e6db74">&#34;</span>);
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> <span style="color:#ae81ff">0</span>;
</span></span><span style="display:flex;"><span>}</span></span></code></pre></div><p>I&rsquo;ll compile the binary with <code>gcc</code> and immediately disassemble it:</p>





<pre tabindex="0"><code>$ make
gcc -o hello -g -Wall main.c
$ objdump -d hello &gt; hello.s</code></pre><p>Let&rsquo;s navigate the dump to the <code>main</code> subroutine:</p>





<pre tabindex="0"><code>0000000000001149 &lt;main&gt;:
    1149:	f3 0f 1e fa          	endbr64
    114d:	55                   	push   %rbp
    114e:	48 89 e5             	mov    %rsp,%rbp
    1151:	48 8d 3d ac 0e 00 00 	lea    0xeac(%rip),%rdi        # 2004 &lt;_IO_stdin_used+0x4&gt;
    1158:	e8 f3 fe ff ff       	call   1050 &lt;puts@plt&gt;
    115d:	b8 00 00 00 00       	mov    $0x0,%eax
    1162:	5d                   	pop    %rbp
    1163:	c3                   	ret
    1164:	66 2e 0f 1f 84 00 00 	cs nopw 0x0(%rax,%rax,1)
    116b:	00 00 00 
    116e:	66 90                	xchg   %ax,%ax</code></pre><p>What we care about here is instruction <code>1158</code>, with the call to <code>puts@plt</code>. This is a reference to a symbol <code>puts</code> in the PLT, which is a result of us calling <code>printf</code> from <code>stdio.h</code> in our program (<code>gcc</code> swaps a <code>printf</code> of a constant string ending in a newline for the cheaper <code>puts</code>).</p>
<p>In the dump we can also analyze the disassembly of the <code>plt</code>:</p>





<pre tabindex="0"><code>Disassembly of section .plt:

0000000000001020 &lt;.plt&gt;:
    1020:	ff 35 9a 2f 00 00    	push   0x2f9a(%rip)        # 3fc0 &lt;_GLOBAL_OFFSET_TABLE_+0x8&gt;
    1026:	ff 25 9c 2f 00 00    	jmp    *0x2f9c(%rip)        # 3fc8 &lt;_GLOBAL_OFFSET_TABLE_+0x10&gt;
    102c:	0f 1f 40 00          	nopl   0x0(%rax)
    1030:	f3 0f 1e fa          	endbr64
    1034:	68 00 00 00 00       	push   $0x0
    1039:	e9 e2 ff ff ff       	jmp    1020 &lt;_init+0x20&gt;
    103e:	66 90                	xchg   %ax,%ax

Disassembly of section .plt.got:

0000000000001040 &lt;__cxa_finalize@plt&gt;:
    1040:	f3 0f 1e fa          	endbr64
    1044:	ff 25 ae 2f 00 00    	jmp    *0x2fae(%rip)        # 3ff8 &lt;__cxa_finalize@GLIBC_2.2.5&gt;
    104a:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)

Disassembly of section .plt.sec:

0000000000001050 &lt;puts@plt&gt;:
    1050:	f3 0f 1e fa          	endbr64
    1054:	ff 25 76 2f 00 00    	jmp    *0x2f76(%rip)        # 3fd0 &lt;puts@GLIBC_2.2.5&gt;
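    105a:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)</code></pre><p>We can see that <code>puts@plt</code> ends up doing an indirect jump through the GOT slot at <code>0x3fd0</code> (<code>0x2f76</code> is the <code>%rip</code>-relative offset to that slot), which holds the address of <code>puts</code> from <code>GLIBC_2.2.5</code> once the symbol has been resolved.</p>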
<p>All of this will be important when we get to the bug itself, so I hope you stayed awake!</p>
<h2 id="how-go-nvml-works">How go-nvml works</h2>
<p>The Go NVML bindings are an interesting challenge. NVML is a closed source library, and the intended usage is to link to the shared object on the system using a public header. So the way the Go NVML bindings work is as follows:</p>
<ol>
<li>Provide a copy of the <a href="https://github.com/NVIDIA/go-nvml/blob/main/pkg/nvml/nvml.h">NVML header</a></li>
<li>Using a 3rd party tool called <a href="https://c.for-go.com/">c-for-go</a> generate a set of Go bindings</li>
<li>Wrap the Go bindings in a light API layer for user friendliness</li>
</ol>
<p>The function that was segfaulting was actually the first function, <code>nvmlInit</code>. So let&rsquo;s look at the process of loading this function:</p>
<ol>
<li>The library <code>libnvidia-ml.so.1</code> is loaded using <code>dlopen</code> with the flags <code>RTLD_LAZY | RTLD_GLOBAL</code>.</li>
<li>Much of the API is versioned in the library, so each of the versioned APIs is searched for in the loaded library using <code>dlsym</code>. If the v2 version of a symbol is present, then the bindings are told to use the v2 version of the symbol. In our case, we are using an NVML library that&rsquo;s new enough to have <code>nvmlInit_v2</code>, so we will end up using that symbol.</li>
<li>Each of these symbols is wrapped with an exported Go function, that loads the library and checks for errors before calling into the generated bindings. So we would call <code>nvml.Init()</code> in our Go code.</li>
<li>This would lead to the generated bindings, which are what actually calls into CGO using <code>import &quot;C&quot;</code> and calls <code>C.nvmlInit_v2()</code>.</li>
</ol>
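<p>The version selection in step 2 boils down to a &ldquo;prefer v2, fall back to the original&rdquo; lookup. The sketch below is a simplified stand-in and not the go-nvml source: <code>symbolTable</code> fakes the result of <code>dlsym</code> with a map, and <code>pickVersion</code> is a hypothetical helper name.</p>

```go
package main

import "fmt"

// symbolTable is a hypothetical stand-in for dlsym on an already-loaded
// library: a name maps to true when the library exports that symbol.
type symbolTable map[string]bool

// pickVersion returns name+"_v2" when the library exports it, and falls back
// to the unversioned name otherwise.
func pickVersion(lib symbolTable, name string) (string, error) {
	if lib[name+"_v2"] {
		return name + "_v2", nil
	}
	if lib[name] {
		return name, nil
	}
	return "", fmt.Errorf("neither %s_v2 nor %s is exported", name, name)
}

func main() {
	newLib := symbolTable{"nvmlInit": true, "nvmlInit_v2": true}
	oldLib := symbolTable{"nvmlInit": true}

	for _, lib := range []symbolTable{newLib, oldLib} {
		sym, err := pickVersion(lib, "nvmlInit")
		if err != nil {
			fmt.Println("lookup failed:", err)
			continue
		}
		fmt.Println("bindings will call:", sym)
	}
}
```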
<h2 id="the-bug">The Bug</h2>
<p>A considerable amount of time has passed since this investigation took place, so I am writing with a ton of hindsight here. This explanation will obscure a ton of straw-grasping, which you can look through in the <a href="https://github.com/golang/go/issues/63264">Go GitHub issue I opened</a>. For the sake of this post though, I&rsquo;m going to skip to the part where it all came together and the issue and solution became clear.</p>
<p>Ignoring the deep inner workings of how the NVML Go bindings work, I will focus on the most important core of it. This project generates C bindings based on an <a href="https://github.com/NVIDIA/go-nvml/blob/v0.12.0-1/gen/nvml/nvml.h">input header file</a>. This header file represents the accessible API for <code>libnvidia-ml.so.1</code>, a proprietary binary that is expected to be installed on the user&rsquo;s machine and loaded at runtime. It is not provided as part of the binding package, and will not be linked as a part of the build. To deal with this, the linker flag <code>--unresolved-symbols=ignore-in-object-files</code> is <a href="https://github.com/NVIDIA/go-nvml/blob/v0.12.0-1/pkg/nvml/nvml.go#L21">passed to the linker as part of the bindings</a>. This flag makes it so the symbols from <code>nvml.h</code>, which are not going to be resolved in the build with the shared object missing, will be ignored by the linker and not considered an error.</p>
<p>Our initial knowledge was that the bug occurred under the following circumstances:</p>
<ol>
<li>Using Go 1.21</li>
<li>Building on Ubuntu Jammy or newer, but not on earlier distros like Debian 10 Buster</li>
</ol>
<p>While at this point in the investigation a lot of these concepts were somewhat new to me, I did have a feeling that, given the issue was with a dynamic library loaded through CGO, it probably had something to do with linking. I suspected that the version of <code>ld</code> on the system was the culprit, and that something in the CGO layer of Go had changed in conflict with a new version of <code>ld</code>. It took me a non-trivial amount of time to realize why, but this ended up mostly correct.</p>
<h3 id="standalone-repro">Standalone Repro</h3>
<p>In order to a) determine whether this was <code>go-nvml</code> specific or something inherent to Go, and b) to not require me to have NVIDIA libraries installed while developing, I created a <a href="https://github.com/braydonk/cgo_dl_repro">standalone reproduction</a>. This confirmed that setting up a small CGO program under the same circumstances (providing a header but no object and passing <code>--unresolved-symbols=ignore-in-object-files</code> to <code>ld</code>) panicked in the exact same way. We can work with this from here on out.</p>
<h3 id="comparing-go-120-to-121">Comparing Go 1.20 to 1.21</h3>
<p>Using the reproduction, I will build 2 binaries, one with Go 1.20 and one with Go 1.21.</p>
<p>The repro program includes a header that defines a function <code>get42</code> and makes a call to it. This symbol should be unresolved in the build, and should show up as such in our binary. If we use <code>nm</code> on the Go 1.20 binary, we can find our <code>get42</code> existing as expected as an unresolved symbol:</p>





<pre tabindex="0"><code>$ nm cgo_dl_repro_go120 | grep get42
0000000000483760 T _cgo_49665a31f432_Cfunc_get42
                 U get42
0000000000483580 t main._Cfunc_get42.abi0
000000000051b1c8 d main._cgo_49665a31f432_Cfunc_get42</code></pre><p>However, checking out the Go 1.21 binary shows an important difference, which is that this symbol is missing!</p>





<pre tabindex="0"><code>$ nm cgo_dl_repro_go121 | grep get42
000000000047ce70 T _cgo_49665a31f432_Cfunc_get42
000000000047cca0 t main._Cfunc_get42.abi0
000000000051b1a8 d main._cgo_49665a31f432_Cfunc_get42</code></pre><p>The only <code>get42</code> symbols are the CGO calls we make in the Go code and the symbol from the C code that CGO generates.</p>
<p>I did not fully grasp what I was looking at when I found this, but this turned out to be the important difference. The <code>get42</code> unresolved symbol being missing actually meant that the <code>get42</code> symbol <strong>did not have an entry in the PLT</strong>. This results in Go generating assembly for this program that looks like this (disassembled by <code>go tool objdump</code>):</p>





<pre tabindex="0"><code>TEXT _cgo_49665a31f432_Cfunc_get42(SB) 
  :0			0x47ce70		4154			PUSHQ R12			
  :0			0x47ce72		55			PUSHQ BP			
  :0			0x47ce73		53			PUSHQ BX			
  :0			0x47ce74		4889fb			MOVQ DI, BX			
  :0			0x47ce77		e88416feff		CALL _cgo_topofstack(SB)	
  :0			0x47ce7c		4989c4			MOVQ AX, R12			
  :0			0x47ce7f		31c0			XORL AX, AX			
  :0			0x47ce81		e87a31b8ff		CALL 0x0 &lt;-- EVIL!!!!	
  :0			0x47ce86		89c5			MOVL AX, BP			
  :0			0x47ce88		e87316feff		CALL _cgo_topofstack(SB)	
  :0			0x47ce8d		4c29e0			SUBQ R12, AX			
  :0			0x47ce90		892c03			MOVL BP, 0(BX)(AX*1)		
  :0			0x47ce93		5b			POPQ BX				
  :0			0x47ce94		5d			POPQ BP				
  :0			0x47ce95		415c			POPQ R12			
  :0			0x47ce97		c3			RET	</code></pre><p>And a reminder of what that panic looks like:</p>





<pre tabindex="0"><code>SIGSEGV: segmentation violation
PC=0x0 m=0 sigcode=1
signal arrived during cgo execution</code></pre><p>That explains how we&rsquo;re getting program counter <code>0x0</code>!</p>
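<p>As a rough sketch of how the <code>nm</code> comparison above could be scripted, here is a small shell helper. The function name <code>has_undefined_symbol</code> is hypothetical (it is not part of the repro), and the sample lines are copied from the two <code>nm</code> outputs shown earlier rather than read from real binaries:</p>
<pre tabindex="0"><code>#!/bin/sh
# Hypothetical helper: reads nm output on stdin and succeeds if the given
# symbol is listed as undefined (type "U"). Undefined entries have no
# address column, so the type is the first field and the name the second.
has_undefined_symbol() {
    awk -v sym="$1" '$1 == "U" { if ($2 == sym) found = 1 } END { exit(found ? 0 : 1) }'
}

# Sample lines copied from the nm output of the Go 1.20 and Go 1.21 binaries.
go120_nm='0000000000483760 T _cgo_49665a31f432_Cfunc_get42
                 U get42
0000000000483580 t main._Cfunc_get42.abi0'
go121_nm='000000000047ce70 T _cgo_49665a31f432_Cfunc_get42
000000000047cca0 t main._Cfunc_get42.abi0'

if printf '%s\n' "$go120_nm" | has_undefined_symbol get42; then
    echo "go120: get42 is an undefined symbol"
fi
if ! printf '%s\n' "$go121_nm" | has_undefined_symbol get42; then
    echo "go121: get42 is missing"
fi</code></pre><p>Against the real binaries, the same check would be <code>nm cgo_dl_repro_go121 | has_undefined_symbol get42</code>. Matching on the exact symbol name avoids false positives from the CGO wrapper symbols that merely contain <code>get42</code>.</p>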
<h3 id="the-solution">The Solution</h3>
<p>While I spent a considerable amount of time experimenting and reading the source code of <code>go tool link</code> and <code>cgo</code> to try to understand what was going on, and I did learn a lot, I ultimately found the problem with a good old-fashioned <code>git bisect</code>, which landed on commit <a href="https://github.com/golang/go/commit/1f29f39795e736238200840c368c4e0c6edbfbae">1f29f39</a>.<br>
The message of that commit: <code>cmd/link: don't export all symbols for ELF external linking</code><br>
The problematic code change was from this:</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#75715e">// Force global symbols to be exported for dlopen, etc.</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">if</span> <span style="color:#a6e22e">ctxt</span>.<span style="color:#a6e22e">IsELF</span> {
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">argv</span> = append(<span style="color:#a6e22e">argv</span>, <span style="color:#e6db74">&#34;-rdynamic&#34;</span>)
</span></span><span style="display:flex;"><span>}</span></span></code></pre></div><p>To this:</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#75715e">// Force global symbols to be exported for dlopen, etc.</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">if</span> <span style="color:#a6e22e">ctxt</span>.<span style="color:#a6e22e">IsELF</span> {
</span></span><span style="display:flex;"><span>	<span style="color:#66d9ef">if</span> <span style="color:#a6e22e">ctxt</span>.<span style="color:#a6e22e">DynlinkingGo</span>() <span style="color:#f92672">||</span> <span style="color:#a6e22e">ctxt</span>.<span style="color:#a6e22e">BuildMode</span> <span style="color:#f92672">==</span> <span style="color:#a6e22e">BuildModeCShared</span> <span style="color:#f92672">||</span> !<span style="color:#a6e22e">linkerFlagSupported</span>(<span style="color:#a6e22e">ctxt</span>.<span style="color:#a6e22e">Arch</span>, <span style="color:#a6e22e">argv</span>[<span style="color:#ae81ff">0</span>], <span style="color:#a6e22e">altLinker</span>, <span style="color:#e6db74">&#34;-Wl,--export-dynamic-symbol=main&#34;</span>) {
</span></span><span style="display:flex;"><span>		<span style="color:#a6e22e">argv</span> = append(<span style="color:#a6e22e">argv</span>, <span style="color:#e6db74">&#34;-rdynamic&#34;</span>)
</span></span><span style="display:flex;"><span>	} <span style="color:#66d9ef">else</span> {
</span></span><span style="display:flex;"><span>		<span style="color:#a6e22e">ctxt</span>.<span style="color:#a6e22e">loader</span>.<span style="color:#a6e22e">ForAllCgoExportDynamic</span>(<span style="color:#66d9ef">func</span>(<span style="color:#a6e22e">s</span> <span style="color:#a6e22e">loader</span>.<span style="color:#a6e22e">Sym</span>) {
</span></span><span style="display:flex;"><span>			<span style="color:#a6e22e">argv</span> = append(<span style="color:#a6e22e">argv</span>, <span style="color:#e6db74">&#34;-Wl,--export-dynamic-symbol=&#34;</span><span style="color:#f92672">+</span><span style="color:#a6e22e">ctxt</span>.<span style="color:#a6e22e">loader</span>.<span style="color:#a6e22e">SymExtname</span>(<span style="color:#a6e22e">s</span>))
</span></span><span style="display:flex;"><span>		})
</span></span><span style="display:flex;"><span>	}
</span></span><span style="display:flex;"><span>}</span></span></code></pre></div><p>What does this mean? The code used to always pass the <code>-rdynamic</code> flag to <code>gcc</code>, which passes <code>--export-dynamic</code> to <code>ld</code> under the hood. After the change, <code>-rdynamic</code> is only passed when <code>ctxt.DynlinkingGo()</code> is true, when building in c-shared mode, or when the linker does not support <code>--export-dynamic-symbol</code>; otherwise, only the symbols explicitly marked for dynamic export are exported, one flag per symbol. The justification for this is in <a href="https://github.com/golang/go/issues/53579">this issue</a> (TL;DR it&rsquo;s because exporting everything is unnecessary in most cases and thus wastes space in the majority of binaries). While it&rsquo;s hard to know exactly when the <code>--export-dynamic-symbol</code> flag was added to <code>ld</code>, its absence from older linkers seems like the only plausible reason that this issue only occurs with a sufficiently new <code>ld</code>.</p>
<p>Since <code>-rdynamic</code> is no longer always passed in the CGO build process, the fix I settled on was to modify the binding generation in <code>go-nvml</code> to <a href="https://github.com/NVIDIA/go-nvml/pull/79">always pass the <code>--export-dynamic</code> linker flag</a>. This doesn&rsquo;t break anything if <code>-rdynamic</code> is also passed, but ensures that the required <code>ld</code> flag is still passed with newer versions of Go and <code>ld</code>.</p>
<h2 id="conclusion">Conclusion</h2>
<p>This was a very hard issue to figure out, and was around a week&rsquo;s worth of effort. The solution was 16 characters. This is why it&rsquo;s hard to measure coding productivity by raw output! :)</p>
<p>I&rsquo;m still glad I went through all of it, and glad I went through the process of re-documenting it by writing up this post. Hopefully you got some enjoyment out of my adventure!</p>
]]></content:encoded>
    </item>
    <item>
      <title>Software Industry vs Software Education</title>
      <link>https://blog.ragecage64.com/blog/software-vs-education/</link>
      <pubDate>Fri, 08 Apr 2022 00:00:00 +0000</pubDate>
      <guid>https://blog.ragecage64.com/blog/software-vs-education/</guid>
      <description>&lt;p&gt;I&amp;rsquo;ve decided to put pen-to-paper (keyboard-to-markdown?) on a rant I&amp;rsquo;ve given to friends and colleagues numerous times since my University career ended. I want to talk about what I like to jokingly refer to as &amp;ldquo;the ticket to the industry&amp;rdquo;: the Bachelor&amp;rsquo;s Degree.&lt;br&gt;&#xA;If you pull up a software dev job posting and check the requirements, there is a ~99.999% chance that one of those requirements is a &amp;ldquo;Bachelor&amp;rsquo;s Degree in Computer Science or a related field&amp;rdquo;. If you&amp;rsquo;re lucky, it will add &amp;ldquo;or equivalent experience&amp;rdquo;.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>I&rsquo;ve decided to put pen-to-paper (keyboard-to-markdown?) on a rant I&rsquo;ve given to friends and colleagues numerous times since my University career ended. I want to talk about what I like to jokingly refer to as &ldquo;the ticket to the industry&rdquo;: the Bachelor&rsquo;s Degree.<br>
If you pull up a software dev job posting and check the requirements, there is a ~99.999% chance that one of those requirements is a &ldquo;Bachelor&rsquo;s Degree in Computer Science or a related field&rdquo;. If you&rsquo;re lucky, it will add &ldquo;or equivalent experience&rdquo;.</p>
<h1 id="my-bachelors-degree">My Bachelor&rsquo;s Degree</h1>
<p>During my undergrad, I hated pretty much everything about school. I knew I loved Computer Science, and I was utterly committed to completing my degree, but I barely made it. The system really felt like it was a carefully designed torture chamber made just for me to pay thousands of dollars to suffer in.<br>
I loved so many of the concepts and subjects I was learning in Computer Science and Math. However, particularly for Math, the shift in priorities coming to University was a shock. The goal of these classes didn&rsquo;t feel like learning anymore; they felt like a game to achieve the best mark. It was a game I sucked at. I don&rsquo;t think there are any words to effectively describe how bad I was at exams. I never properly learned how to cope with my intense distractibility and struggle to focus throughout school, and my memory for concepts I didn&rsquo;t deeply understand was incredibly fragile. My success in any course, even Computer Science ones, hinged almost entirely on what percentage of the mark was derived from exams. Even worse were the courses that required you to pass the final exam to pass the course (this is the worst of the &ldquo;torture chamber designed for me&rdquo;). I failed 2 classes, both through final exams alone: Object-Oriented Programming (which I had done extensively even then, but choked writing Java on paper), and Probability (which terminated my then-burgeoning interest in Data Science).</p>
<p>I wouldn&rsquo;t think so much about the good old torture chamber now, considering I&rsquo;m years removed from receiving my degree, but I am constantly upset when reminded of what university could have been for me. My passion has shifted from just Software Development to deep Computer Science. My wife loves to poke fun at me for reading basically only CS textbooks, and my spare time is often spent learning about increasingly deep computer science concepts. While I was in school, all I could feel was the stress-rage-hybrid of looming exams, and the intense desire to be free and get what I really wanted: a job as a software developer. I&rsquo;m so grateful to have achieved that goal, but I still can&rsquo;t help but think about how different my life would have been if I hadn&rsquo;t had to go through something I was so bad at to get there.</p>
<p>Believe it or not, this article is not just for me to complain how much I hated university and exams (although it was cathartic to write, and I&rsquo;m leaving it in). Despite how much I hated it, I really can&rsquo;t blame University for being&hellip; University. The validity of the post-secondary system isn&rsquo;t really the dialogue I&rsquo;m going for (at least today). The real point here is that University was simply not for me.</p>
<h1 id="university-wasnt-for-me-but-what-choice-did-i-have">University wasn&rsquo;t for me, but what choice did I have?</h1>
<p>I was led to believe that University was the only way to get into the software industry. When it came time to decide my future, there didn&rsquo;t seem to be an alternative. Even going for college seemed like a death knell for your chances to break into the industry, and I couldn&rsquo;t even fathom trying to get in as a self-taught developer. These were obviously not true then, and are even less true now as the narrative around alternate paths to the industry has improved significantly. At the time however, I hadn&rsquo;t connected to any tech communities even online, and was far too sheepish to reach out for mentorship. My pipeline to the industry was driven largely by how my high school directed me and my vapid attempts to make video games on my own time. I could only consume the information and assumptions that were most readily available to me. It sure didn&rsquo;t seem like bad information at the time either; every job posting I checked required a Bachelor&rsquo;s Degree. EVERY one. It sure seemed like my only course of action.</p>
<h1 id="disdain-for-bachelors-degrees">Disdain for Bachelor&rsquo;s Degrees</h1>
<p>I have been waxing lyrical about my own woe-is-me relationship with University, but I&rsquo;m not alone. The glorified engagement farm widely known as tech twitter has been firing away the catchy tweets about how they got into the industry without a degree, or vaguely asking whether the twitter-verse thinks a degree is required to get a job as a developer. (I guess I shouldn&rsquo;t be so cynical about it, they are generating infinitely more clicks than this blog with no SEO will).</p>
<p>I&rsquo;ve spoken to many of my peers and colleagues in the community, and an (anecdotally) common sentiment is that they feel university did not prepare them adequately for the &ldquo;real world&rdquo; of software development. Common misgivings were the outdated technology used in courses, the heavy requirements for seemingly unrelated maths, and lacking guidance on realistic software industry skills (source control, software architecture, web development tooling).</p>
<p>It seems like so many people are on the same page about this in their own way: <strong>what is taught for a Bachelor&rsquo;s Degree seems to be heavily at odds with the standard industry requirement for it</strong>.</p>
<h1 id="do-universities-need-to-get-with-the-times">Do universities need to &ldquo;get with the times&rdquo;?</h1>
<p>You could get upset at post-secondary programs in general. Perhaps these programs need to teach more applicable, employable skills. Maybe they should be directing student learning toward more practical topics to increase their confidence to enter the industry. If these classes aren&rsquo;t teaching students what they feel will be useful for their jobs, then what&rsquo;s the point?</p>
<p>The counter to this is often to espouse the value of foundational knowledge. The things you learn in University may not be things most folks will do day to day in their careers, but these fundamental concepts are an effective way to become a well-rounded developer.<br>
So which of my strawmen is right?<br>
SIKE, they both are. Sort of.</p>
<h1 id="the-modern-software-industry">The modern software industry</h1>
<p>Software is a pretty young industry overall; however, the prevailing goal of most software jobs has remained relatively constant: to create products that people use. While that goal hasn&rsquo;t changed much, the tools available to accomplish that goal have changed drastically. The advancement of developer-focused tools and frameworks has led to a major shift in the kind of skills necessary to get started developing software. The tools at a developer&rsquo;s disposal have become so sophisticated that they abstract numerous fundamental building blocks that previously required deep knowledge to use. Web application frameworks blur the line between servers and clients; Kubernetes has made distributed systems a game of learning the available tools; UI design suites have broken the barrier between vision and implementation. (Disclaimer: I know all of these are severe oversimplifications). The general theme of modern tools is to abstract difficult foundational concepts to flatten the barrier to entry; your success in these tools would no longer hang on how well you understand the complex technical concepts it abstracts, and instead on how well you can learn the tool (which is usually a far faster process). I&rsquo;d argue there are very few tools that have fully achieved that goal, but I can feel the paradigm shift. Over time, a new vacuum of the industry has formed entirely for talent with existing expertise in these specific modern tools.</p>
<p>Herein lies the crux of the problem; CS programs at post-secondary institutions produce well-rounded software developers who can leverage their foundational knowledge to numerous paths in software development, but their education may not have prepared them for the overwhelming number of jobs that require specific skills in industry tools. So many students simply wanted to start working in the industry, but the industry pushed them toward a seemingly false start.</p>
<h1 id="software-development-as-a-trade">Software Development as a Trade</h1>
<p>I think there&rsquo;s still a place for University. For these modern tools to exist as monolith abstractions of complicated foundations, there need to be niche experts to build them. There are still a number of software jobs that would benefit greatly from folks with deeper academic knowledge. Still, a large number of jobs don&rsquo;t seem to be after academics, rather after software developers as practitioners of a trade. I think framing software development as a trade vs. an academic pursuit serves the needs of the modern industry pretty well. Software development as a trade is more like using the development of software as a means to an end to accomplish business goals. Practitioners of software development as a trade would be trained specifically in the relevant tooling, and their expertise would be catered to the needs of the industry. Software developers as academics, computer scientists if you will, would be the folks doing intense research and studying. They would be the experts building the bedrock of computing, and the tradespeople would be the experts bringing it to wider society.</p>
<p>What would this sort of separation gain us? For starters, there are more paths for future developers to enter the industry. Rather than University seeming like the only path forward, perhaps there could be a shorter trade program, or something like a bootcamp that trains people explicitly to become practitioners; they would start with the basics as any software education should, but provide a more direct path to preparing specifically for jobs in the industry. Such a large number of developers endeavour only to build great things and use software development as their tool to do that; more focused trades programs would get them to that goal faster. It would also increase the rate of new talent joining the industry, and that new talent would arguably be more primed to onboard to the average company building products with popular tools. This would still leave a place for universities not only to continue teaching the important required topics of a Computer Science degree, but even relieves pressure on them to conform to the needs of the industry. Developers with aspirations to learn specifically Computer Science can go into post-secondary, and those interested mainly in Software Development can pursue it as a trade.</p>
<p>I think we may sort of be heading in this direction already; I have never been to one, but bootcamps do seem to be similar in spirit to what I&rsquo;m trying to describe. I think one of the limiting factors for alternate paths to the industry is the persistent dogma around Bachelor&rsquo;s Degrees, and the usage of the degree as an arbitrary barrier for new folks to enter the industry. I think the biggest realistic step forward for the industry would be to not only acknowledge the validity of alternate paths, but also understand where they may be advantageous instead of simply settling for them.</p>
<p>If I&rsquo;m being honest with myself, much of this is pie-in-the-sky optimism; any immediate shifts like this would require disjointed demographics and organizations with different values to somehow shift their priorities in sync. We do appear to be taking baby steps though; there is a rise in vocal self-taught programmer pride, and an increasing number of developers are finding their way into the field through bootcamps and online courses.</p>
<p>To be Fair and Balanced though, this idea would be unlikely to become a utopia. It would help a lot more people enter our industry in ways that suit their goals and learning style, and would allow companies to hire in a way that more directly suits their requirements. However, in our present environment of late-stage capitalism -</p>
<h1 id="i-snuck-an-anti-capitalist-premise-into-my-blog-post">I SNUCK AN ANTI-CAPITALIST PREMISE INTO MY BLOG POST</h1>
<p>GOTCHA! YOU SHOULD SEE YOUR FACE RIGHT NOW!</p>
<p>In our present environment of late-stage capitalism, we apparently cannot get enough of social and class hierarchies. A field like software development attracts a lot of pearl clutching and vapid gatekeeping. A very loud minority of people are desperate to fight over the definition of a &ldquo;real developer&rdquo;, feeling personally offended and protective of the title because some developers never needed to untangle hundred line C++ template error messages. This sort of desperation to attain and maintain pseudo-intellectual superiority over each other would absolutely be exacerbated by a publicly accepted difference between &ldquo;software tradespeople&rdquo; and &ldquo;computer scientists&rdquo;. In the worst case (and probably most likely) scenario, companies will absolutely eat that up. Whatever their public messaging might be, they would likely use this difference to create new pay hierarchies, and find a metric-assload of creative ways to keep tradespeople underlevelled and underpaid. I think there&rsquo;s a very real chance it could devolve into a sort of class system among software developers, where university degrees arbitrarily earn smarmy confidence, and higher wages for similar work; arguably, this is even today&rsquo;s status quo because some people suck.</p>
<h1 id="conclusion">Conclusion</h1>
<p>I don&rsquo;t know that there is a perfect solution to the problematic relationship between our industry and Bachelor&rsquo;s Degrees. I&rsquo;m certainly not one to suggest sticking to something just because it&rsquo;s the way we&rsquo;ve always done it, so I can&rsquo;t help but ideate some perfect balance we may never truly achieve.<br>
If you&rsquo;ve clicked on this post there is a high chance we know each other and you&rsquo;ve already heard me say all this, but if not: first of all, hi! Thanks for reading! If you are a new software developer, I hope you don&rsquo;t feel as trapped as I did when I started, and that you are aware of the paths open to you. If you&rsquo;re already in University, I hope this doesn&rsquo;t somehow sour your perspective; I may have hated school, but I still consider it an incredibly valuable part of my life, and I hope it is the same for you. If you&rsquo;re someone who does hiring in any capacity, I hope this post inspires you, in however minor a way, to critically consider what you look for and how you can adjust to keep yourself open to the talent that&rsquo;s waiting to find you.<br>
Whoever you are, I hope you took something away whether you agree or disagree. As always, I&rsquo;m happy to discuss either way, because I don&rsquo;t claim to be in any way an expert and I would love to hear your thoughts.</p>
]]></content:encoded>
    </item>
    <item>
      <title>Self hosting with Caddy, gitea, hugo, bitwarden, and more!</title>
      <link>https://blog.ragecage64.com/blog/self-hosting-adventure/</link>
      <pubDate>Sat, 15 Jan 2022 00:00:00 +0000</pubDate>
      <guid>https://blog.ragecage64.com/blog/self-hosting-adventure/</guid>
      <description>&lt;p&gt;I have always wanted to try self-hosting things that are clearly better done by a SaaS provider. That&amp;rsquo;s why I took a few hours, a big ol&amp;rsquo; Ubuntu VPS, and a domain name to try and self-host a bunch of things I use every day! I might hate myself later, but I&amp;rsquo;m having fun for now. I decided to write a little bit about what I did to make everything work.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>I have always wanted to try self-hosting things that are clearly better done by a SaaS provider. That&rsquo;s why I took a few hours, a big ol&rsquo; Ubuntu VPS, and a domain name to try and self-host a bunch of things I use every day! I might hate myself later, but I&rsquo;m having fun for now. I decided to write a little bit about what I did to make everything work.</p>
<h1 id="ufw-uncomplicated-firewall">UFW (Uncomplicated Firewall)</h1>
<p>If I had gone with a more fully-featured cloud hosting provider, such as <a href="https://www.digitalocean.com/">DigitalOcean</a> or <a href="https://www.linode.com/">Linode</a> (not affiliated with either), I would have been able to configure my VPS&rsquo;s firewall through a UI console. However, I had already purchased a really large server for cheaper with another provider. This meant I needed to set up my firewall right on the server myself. As a software developer, I am horrible at SysAdmin by nature; the idea of setting up critical iptables shook me to my very core. This was why I chose to configure my firewall with the easier to use ufw.</p>
<p>The setup I needed was as follows: deny all incoming traffic by default, allow all outgoing by default, then allow traffic on the ports I needed (namely SSH, HTTP, and HTTPS).<br>
What scared me the most was potentially locking myself out of my server. Working directly with iptables put me at risk of this, as iptables rules operate at the kernel level. If I changed one wrong thing, I could completely lock myself out of my server. ufw didn&rsquo;t have this problem, because it runs as a service; I could configure all of my ufw rules and start the service when I felt everything was ready. I did a dry-run on a temporary tiny VPS to make sure I wouldn&rsquo;t lock myself out (<code>sudo</code> removed for brevity):</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>ufw default deny incoming
</span></span><span style="display:flex;"><span>ufw default allow outgoing
</span></span><span style="display:flex;"><span>ufw allow ssh
</span></span><span style="display:flex;"><span>ufw allow http
</span></span><span style="display:flex;"><span>ufw allow https
</span></span><span style="display:flex;"><span>ufw allow <span style="color:#ae81ff">25565</span> <span style="color:#75715e"># I have a Minecraft server running on this machine already!</span>
</span></span><span style="display:flex;"><span>ufw enable</span></span></code></pre></div><p>After doing this, I disconnected and tried to ssh into the server again. It worked as expected, and now that I&rsquo;d verified it worked on the test VPS, I ran them on my main VPS with similar success.</p>
<p>Time to come clean though; this wasn&rsquo;t the first thing I did. This was actually one of the last things I did (hence why I was extra scared of locking myself out). The main reason I did this was to make sure my server wouldn&rsquo;t accept connections to <code>http://&lt;ip&gt;:&lt;port&gt;</code>. This worked for some of my services, but not for one of them. It didn&rsquo;t work because ufw by default cannot stop Docker from accepting connections directly to published ports. On install, Docker adds iptables rules that are evaluated before ufw&rsquo;s. I tried a veritable cornucopia of bad solutions before finding <a href="https://github.com/chaifeng/ufw-docker">this repo</a> that provided the perfect solution for me.</p>
<p>With my server locked down (let&rsquo;s pretend that&rsquo;s the first thing I did, like it should have been) it was time to move on to the server I decided to use for reverse proxying.</p>
<h1 id="caddy">Caddy</h1>
<p>In my previous attempts at hosting things myself, I had fumbled through nginx reverse-proxy tutorials. While nginx is a great skill to learn, and an incredibly mature tool, I decided to take a different route this time and use Caddy. I acknowledge that nginx is great technology, but after using it on this server I am officially sold on Caddy.</p>
<p><img src="/caddy-best-friend.jpg" alt="friendship ended with nginx, now caddy is my best friend"></p>
<p>The two reasons I love Caddy are its <a href="https://caddyserver.com/docs/caddyfile">super easy configuration</a> and its <a href="https://caddyserver.com/docs/automatic-https">automatic https</a>. The biggest challenges I had with hosting things myself in the past were partly my poor nginx configuration abilities, but largely that messing with <a href="https://certbot.eff.org/">certbot</a> (an admittedly great and easy to use project) was a lot more work than I wanted to constantly manage for every single project that I wanted to host. HTTPS is a process that can be automated, and Caddy proves that. Now, I simply add a new site configuration block to my <code>Caddyfile</code> and I automatically have HTTPS for that site (provided the domain name I specified has DNS configured correctly, more on that later).</p>
<p>I installed Caddy on my system through the stable <a href="https://caddyserver.com/docs/install#debian-ubuntu-raspbian">apt repo</a>. I used it as a systemd service, and added configuration to the <code>/etc/caddy/Caddyfile</code>. When I mention &ldquo;adding something to the Caddyfile&rdquo; further down this article, I am referring to editing this file and running <code>sudo systemctl restart caddy</code> to reload the configuration.</p>
<h1 id="gitea">Gitea</h1>
<p>The first thing I wanted to set up was my own git server! I used a fantastic open-source project called <a href="https://gitea.io/en-us/">Gitea</a> that replicates a lot of GitHub&rsquo;s features. It&rsquo;s missing some of the more advanced GitHub features, but as a place to toss my personal project code it seemed perfect.</p>
<p>I installed Gitea through a user-maintained <a href="https://gitlab.com/packaging/gitea">deb package</a>. Honestly, if I were starting from the top, I probably would just install it <a href="https://docs.gitea.io/en-us/install-with-docker/">through Docker</a>, but this is working fine for now anyway. Installing this package created a <code>gitea</code> user for me, so I created the necessary directories from <a href="https://docs.gitea.io/en-us/install-from-binary/#prepare-environment">this Gitea tutorial</a> and gave the <code>gitea</code> user access instead of the <code>git</code> user that these docs suggest manually creating. The rest of the steps from then on in the docs ended up working for me. So now I was able to run the <code>gitea</code> systemd service on port 3000. The next step was to set up the reverse proxy so I could get into my Gitea instance.</p>
<p>I have a domain (<code>ragecage64.com</code>) with Google Domains; however, I don&rsquo;t use anything specific to that system. All I had to do was add a DNS Address record for the <code>git</code> subdomain I wanted. It looked something like this:</p>
<p><img src="/git-dns.jpg" alt="the git DNS record"></p>
<p>Once this was set up, I added the following block to my Caddyfile:</p>





<pre tabindex="0"><code>git.ragecage64.com {
    reverse_proxy localhost:&lt;gitea port&gt;
}</code></pre><p>After restarting caddy, I had <a href="https://git.ragecage64.com">https://git.ragecage64.com</a> ready to go! There was a first-time setup screen that I forgot to take a screenshot of, but it is pretty self-explanatory. I spent a little bit of time selectively migrating the repos I wanted to keep to my new Gitea instance, and setting up my SSH key and new username. Really loving Gitea so far!</p>
<h1 id="get-files-from-gitea-repos-within-the-server">Get files from Gitea repos within the server</h1>
<p>This was an important step to the next couple things I&rsquo;m going to talk about. Once things are pushed to my Gitea instance, I&rsquo;m able to access those files on the server to perform any kinds of builds I may need to run them.<br>
To do this, figure out where your Gitea is storing its repos (for me it was in the default directory <code>/var/lib/gitea/data/gitea-repositories/&lt;gitea username&gt;</code>). In this folder you can find the <a href="https://mijingo.com/blog/what-is-a-bare-git-repository">bare repositories</a>. To get the data from these bare repositories from anywhere on your server, you can clone the bare repository, i.e. <code>git clone $GITEA_REPO_DIR/&lt;your repo&gt;.git</code>. Now you have a copy of the code on your server to do with whatever you please.</p>
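<p>To make the bare-repository clone concrete, here is a minimal sketch that can be run in a scratch environment. Everything in it is an assumption for illustration: the temporary directories stand in for the Gitea repository directory, <code>demo.git</code> is a made-up repo name, and the commit identity is a placeholder.</p>
<pre tabindex="0"><code>#!/bin/sh
set -eu

# Hypothetical stand-in for the directory where Gitea keeps a user's
# bare repositories, plus a scratch area for the clones.
GITEA_REPO_DIR=$(mktemp -d)
WORK_DIR=$(mktemp -d)

# Create an empty bare repository, playing the role of a repo hosted in Gitea.
git init --quiet --bare "$GITEA_REPO_DIR/demo.git"

# Simulate pushing some code to it from a development checkout.
git clone --quiet "$GITEA_REPO_DIR/demo.git" "$WORK_DIR/dev"
touch "$WORK_DIR/dev/README.md"
git -C "$WORK_DIR/dev" add README.md
git -C "$WORK_DIR/dev" -c user.name=demo -c user.email=demo@example.com commit --quiet -m "initial commit"
git -C "$WORK_DIR/dev" push --quiet origin HEAD

# The step described above: clone the bare repository to get a working copy.
git clone --quiet "$GITEA_REPO_DIR/demo.git" "$WORK_DIR/checkout"
ls "$WORK_DIR/checkout"</code></pre><p>On the actual server, <code>GITEA_REPO_DIR</code> would point at the existing Gitea repository directory instead, and only the final <code>git clone</code> would be needed.</p>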
<h1 id="bub-the-discord-bot">Bub the Discord Bot</h1>
<p>This was probably the easiest thing to set up. My bot is written in Go, meaning all I need is a Discord Bot Token in my local env and to run the compiled program. First, I <a href="https://git.ragecage64.com/RageCage64/bub-the-bot">pushed the bot code to my Git</a>. Then I pulled the bare repo on my server, ran the command in the Makefile, and ran the compiled binary. Pretty simple setup!</p>
<p>The challenge came when I needed to run multiple apps at once on my server. The more correct thing would probably be to create systemd services out of everything, but that&rsquo;s hard. :)<br>
The two things I needed to run and quickly get at logs for are my Minecraft server and Bub. I used multiple <a href="https://www.gnu.org/software/screen/">GNU screen</a> sessions to accomplish this. I started named screen sessions like so:</p>





<pre tabindex="0"><code>screen -S minecraft</code></pre><p>Once I created the screen session, I ran the server and detached from the session with <code>Ctrl+A, D</code>.<br>
Then when I needed to reattach to the screen, I could use the command:</p>





<pre tabindex="0"><code>screen -xS minecraft</code></pre><p>I did the same thing with my bot. Pretty good setup overall!</p>
<h1 id="this-blog">This Blog</h1>
<p>I am now also hosting this blog on my server! This blog is a static site made with <a href="https://gohugo.io/">Hugo</a>, which I highly recommend if you&rsquo;re looking to make a blog. The great thing about this was that a static site is similarly easy to host through Caddy!<br>
I started by installing hugo, cloning the bare repo (I had to <code>--recurse-submodules</code> because I installed a theme, important step), and running <code>hugo</code>. This built my site to the <code>public</code> folder (optionally, could output this public folder to a smarter place in the server). Next, I added the following block to my Caddyfile:</p>





<pre tabindex="0"><code>blog.ragecage64.com {
	root * &lt;full path to site folder&gt;/public
	file_server
}</code></pre><p>And added the similar DNS record as above.<br>
Now I have the site you are currently on! To update my blog now I push to my repo, pull the server copy of the repo, and run <code>hugo</code>. A bit more work than GitHub Pages where I previously hosted this site, but every part of this is more work than it used to be and I&rsquo;m still having fun!</p>
<h1 id="bitwarden">BitWarden</h1>
<p>The last thing I got working was my own BitWarden instance to share with my partner and family. To do this, I decided to run a <a href="https://hub.docker.com/r/bitwardenrs/server">docker container of the Rust implementation</a> of Bitwarden. I created a docker-compose file for the container (which maybe wasn&rsquo;t necessary because I&rsquo;m just using SQLite anyway, but that makes it easier to add a real DB later) and ran it in the background with <code>docker-compose up -d</code>. I then created a DNS record and Caddy reverse proxy similar to Gitea above, and followed the <a href="https://bitwarden.com/help/article/change-client-environment/">instructions to connect BitWarden clients to my instance</a>. When I first started the instance, I used the container environment variable <code>SIGNUPS_ALLOWED=true</code>. This allowed me and my partner to quickly sign up, before I restarted the container with this environment variable set to false. This means only the people I want to sign up for my instance can; it&rsquo;s only on a SQLite database, it&rsquo;s not exactly web scale!</p>
<h1 id="who-knows-what-else">Who knows what else!</h1>
<p>Now I have an easy way to host any future projects on one server! It&rsquo;s pretty exciting, and I don&rsquo;t know what&rsquo;s going up next, but next time I think of something exciting it&rsquo;s fun to know I always have somewhere to put it!</p>
]]></content:encoded>
    </item>
    <item>
      <title>colors and faker: a case study on the npm ecosystem</title>
      <link>https://blog.ragecage64.com/blog/colors-and-faker/</link>
      <pubDate>Mon, 10 Jan 2022 00:00:00 +0000</pubDate>
      <guid>https://blog.ragecage64.com/blog/colors-and-faker/</guid>
      <description>&lt;h1 id=&#34;foreword&#34;&gt;Foreword&lt;/h1&gt;&#xA;&lt;p&gt;For years I&amp;rsquo;ve listened to software engineers more experienced than myself poke fun at the &lt;a href=&#34;https://www.theregister.com/2016/03/23/npm_left_pad_chaos/&#34;&gt;left-pad incident&lt;/a&gt;. Usually used as a joking throwaway comment about keeping package-lock files in sync, or in accordance with the &lt;a href=&#34;https://xkcd.com/2347/&#34;&gt;related xkcd comic&lt;/a&gt; (which seems to get more relevant the older it gets). It was technically just before my time as a professional developer (my less-than-stellar jQuery experimentation was safe from this at the time), so I would take it as a cautionary tale that taught us an important lesson about the software supply chain.&lt;br&gt;&#xA;It also informed a lot of the learning I have done over the years about what it means for software to be open source, the nuances of open source software licensing, and the &lt;a href=&#34;https://www.wired.com/2006/09/free-as-in-beer/&#34;&gt;difference between freedom and beer&lt;/a&gt;. I&amp;rsquo;ve always been passionate about software that is at the very least source-available; the collaboration between so many talented and passionate people has always felt like something of a panacea to me (depending how rosy my glasses are that day).&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h1 id="foreword">Foreword</h1>
<p>For years I&rsquo;ve listened to software engineers more experienced than myself poke fun at the <a href="https://www.theregister.com/2016/03/23/npm_left_pad_chaos/">left-pad incident</a>. Usually used as a joking throwaway comment about keeping package-lock files in sync, or in accordance with the <a href="https://xkcd.com/2347/">related xkcd comic</a> (which seems to get more relevant the older it gets). It was technically just before my time as a professional developer (my less-than-stellar jQuery experimentation was safe from this at the time), so I would take it as a cautionary tale that taught us an important lesson about the software supply chain.<br>
It also informed a lot of the learning I have done over the years about what it means for software to be open source, the nuances of open source software licensing, and the <a href="https://www.wired.com/2006/09/free-as-in-beer/">difference between freedom and beer</a>. I&rsquo;ve always been passionate about software that is at the very least source-available; the collaboration between so many talented and passionate people has always felt like something of a panacea to me (depending how rosy my glasses are that day).</p>
<p>This is all to say that reading about what happened with the npm packages <code>colors</code> and <code>faker</code> left me with a lot to say. I would have gone the lazy route and tweeted my thoughts to the void as usual, however I haven&rsquo;t posted to this blog in <em>checks notes</em> 10 months! Some content creator I am. The shareholders will have my head!</p>
<p>So without further ado, I&rsquo;d like to take as nuanced a look as I can at all the moving pieces of this fascinating case study.</p>
<h1 id="what-happened-with-colors-and-faker">What happened with <code>colors</code> and <code>faker</code>?</h1>
<p>The headline is not clickbait enough to attract anyone who does not already know about this situation (other than my proofreading partner, hi dear!). However, for the purposes of this post, I&rsquo;m going to pretend you have no idea what&rsquo;s going on and summarize quickly so we can build some context.</p>
<p><a href="https://www.npmjs.com/package/colors"><code>colors</code></a> is an npm package that enables the user to colour their console text in their command line applications. Command line applications may not be the first thing that come to mind when you think of Node.js, but a vast majority of JavaScript dev tools have a command line interface and leverage this package to improve the appearance of their output.</p>
<p><code>faker</code> (no link for this one; will explain shortly) is an npm package that will randomly generate data, however this data is believable; it falls into common data patterns like names, street addresses, movie quotes, etc. I am not exactly sure which was first, but this library was heavily inspired by counterparts in other languages such as Perl, Ruby, PHP, and Python.</p>
<p>These packages are authored and maintained by the same developer: Marak Squires (<a href="https://github.com/Marak">see his GitHub</a>). These packages were used by thousands of Node.js applications, all published to and subsequently downloaded from the node package manager&rsquo;s central repository. This large repository of packages is owned by npm, Inc. and GitHub, and is the source from which virtually every node application pulls at least some open source dependencies. <code>colors</code> and <code>faker</code> were both open source and published with the <a href="https://opensource.org/licenses/MIT"><code>MIT</code> License</a>.</p>
<p>Last year, the author of these two packages decided that they were no longer interested in developing and maintaining the packages. They opened <a href="http://web.archive.org/web/20210704022108/https://github.com/Marak/faker.js/issues/1046">this issue</a>, declaring that they would no longer be working on the package. Last week, they took this a step further: they intentionally <a href="https://github.com/Marak/colors.js/commit/074a0f8ed0c31c35d13d28632bd8a049ff136fb6">introduced an infinite loop with spooky text in <code>colors</code></a> and, as we zoomers might say, <a href="https://www.npmjs.com/package/faker">yeeted <code>faker</code> from existence</a> (not really, but I will expand on that later on). This affected thousands of Node.js applications, which means it affected a ton of developers and companies of all sizes. And I mean &ldquo;all sizes&rdquo;; one of the affected packages I am personally familiar with is Amazon&rsquo;s <a href="https://github.com/aws/aws-cdk/commit/b851bc340ce0aeb0f6b99c6f54bceda892bfad0e"><code>aws-cdk</code></a>, and this is just one of many widely used packages that were essentially bricked until the issue was resolved.</p>
<p>Now that we have a general idea of what happened, I&rsquo;d like to add my interpretation of what it means to work with npm.</p>
<h1 id="what-does-it-mean-to-download-an-npm-package">What does it mean to download an npm package?</h1>
<p>One of the earliest lessons I learned when I first started using <a href="https://www.reddit.com/r/copypasta/comments/czef0u/id_just_like_to_interject_for_a_moment/">Linux</a> is to not download and execute random scripts without reading them first and understanding the risk. They require that you make a conscious decision to trust the source of the script. This makes sense when you think about it; a bash script (especially with sudo permissions) has the power to do an incredible amount of damage to your system (or maybe just <a href="https://en.wikipedia.org/wiki/Fork_bomb">fork bomb</a> you as an epic prank, anything goes). Usually, where possible, you were encouraged to install your software through your distribution&rsquo;s central package repository. This large repository of packages, all built specifically for the distribution, is owned and maintained by a dedicated group of volunteers or employees who vet and approve each one. There are ways for independent users or organizations to host their own repositories of packages, and integrate with the distribution&rsquo;s respective package managers. These require that you trust their source, similarly to downloading and executing bash scripts.</p>
<p>The reason for this tangent is to relate it back to <code>npm install</code>ing a package. Installing a package through npm is a combination of these two flavours of installing software; it is a package manager similar to the ones commonly included in Linux distributions, however each package in its central repository is not vetted and managed by a group of volunteers or employees. When you download and execute a package from npm&rsquo;s central repository, you are trusting the author of that package.</p>
<p>Now it&rsquo;s obviously a bit extreme to directly equate installing an npm package to <code>sudo</code> executing a bash script. It&rsquo;s a lot more nuanced than this since the most popular packages in the repository also have the most security experts&rsquo; eyes on them at all times. They may not be constantly approved by a central group of people, but in a perfect world issues are swiftly reported and dealt with by package maintainers and consumers of the package. npm also has a number of mechanisms to keep dependencies at a certain version until you trust that an upgrade is up to your standards.</p>
<h1 id="what-does-it-mean-to-publish-an-npm-package">What does it mean to publish an npm package?</h1>
<p>Every JavaScript package on npm is open source by nature. JavaScript is an interpreted language, and no amount of obfuscation will truly hide the JavaScript being shipped when a package is published to the central repository. This code can be licensed under any open source license that suits the project&rsquo;s needs. This license legally defines the way that the copyright holder approves the code to be used. Once the license (or lack of one) is defined, and the <a href="https://docs.npmjs.com/cli/v8/commands/npm-publish">minimum setup requirements</a> are present, you are free to publish whatever you would like. You could publish the next big JavaScript framework, a useful new CLI tool, <a href="https://docs.npmjs.com/cli/v8/commands/npm-publish">nothing</a>, whatever you&rsquo;d like provided it is legal.</p>
<h1 id="how-weve-forgotten-this">How we&rsquo;ve forgotten this</h1>
<p>Node.js and npm are tools that have come about as close to ubiquity as a very short list of technologies ever have. The number of new developers whose first step of their journey was/will be to run <code>npm install</code> is staggering. The largest companies in the world continue to rely on npm in varying capacities, and have contributed a large number of popular packages to its ecosystem. When such an apparent consensus of people are doing something, it&rsquo;s easy to interpret some guarantee of safety. You wouldn&rsquo;t jump off a bridge just because someone else did, but if 10 000 people jump off a particular bridge every day it&rsquo;s gotta be safe, right?</p>
<h1 id="one-way-trust">One-way Trust</h1>
<p>Let&rsquo;s have a quick look at that <a href="https://opensource.org/licenses/MIT"><code>MIT</code> license</a> again. The first line of this license states: &ldquo;Permission is hereby granted, free of charge, to any person obtaining a copy of this software[&hellip;] to deal in the Software without restriction[.]&rdquo;. Further down, the license states: &ldquo;[sic, all caps] THE SOFTWARE IS PROVIDED &ldquo;AS IS&rdquo;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED[.]&rdquo;.<br>
I am not a lawyer, but my interpretation of this in plainer terms is that anyone is allowed to use this software as they see fit, but that the copyright holder provides no guarantee of anything that may come with that freedom. If it doesn&rsquo;t work for some reason, doesn&rsquo;t do what you want, or has some kind of major flaw, the copyright holder does not bear the responsibility. When some of the biggest packages on npm are maintained by such large, efficient, and dedicated teams, it is easy to forget that warranty of any kind is almost never included by default; that&rsquo;s one of the exchanges that is made when the software does not cost anything.</p>
<h1 id="how-this-relates-to-colors-and-faker">How this relates to <code>colors</code> and <code>faker</code></h1>
<p>All of this is to contextualize my general takeaway from this situation; the author of <code>colors</code> and <code>faker</code> was within their right to self-sabotage their packages. A quick disclaimer that I personally dislike what they decided to do, and I&rsquo;ll shortly expand on that, but I want to explain why I think what they did is within their right.<br>
I think this goes right to the root of what open source is, distilled to its core: some person writes code, they put it somewhere for the public to see, the public can do with it whatever the license permits. To be reductionist for the sake of brevity (let&rsquo;s laugh and pretend I have that), I think every single piece of open source code reduces down to this. Even if code is published by a Fortune 500 company who have great motivation to maintain that software until the end of time, even if software is so popular that volunteers will diligently shepherd it along the moral high ground to the heat death of the universe, open source software is still this at its core. When you use this published code, you have made the implicit decision to trust the person who published it. The only binding promise that the copyright holder has made in return is that the code exists and you can use it.</p>
<p>Marak wrote these packages and published them under the MIT license. They can update this code however they want. If they want to intentionally corrupt the package making it print <code>LIBERTY</code> a bunch of times to scare the bejeezus out of me, it appears to me that they are within their right to do so. I saw a number of people on Twitter decrying a &ldquo;breach of trust&rdquo;, and how it ruins the image of the npm ecosystem for a package author to do this. In my opinion, something is only a &ldquo;breach&rdquo; of trust when that trust is established two way and both parties are responsible for it. <strong>Package authors are not responsible for the one-way trust you placed in them, nor are they responsible for the effect their actions have on the reputation of the npm ecosystem.</strong></p>
<p>The way Marak went about this was pretty &ldquo;chaotic evil&rdquo; for my tastes, and I don&rsquo;t personally appreciate that they broke the trust so many people placed in them. However, I think a large number of us have forgotten that the people we download all these gigantic dependencies from are not in any way obligated to maintain a relationship of trust, and on paper could go nuclear at any time. The modern software supply chain has led to us unknowingly permitting this at an unprecedented scale for decades.</p>
<h1 id="you-are-being-ridiculous">You are being ridiculous</h1>
<p>Yeah, sort of. I am being pretty &ldquo;doom and gloom&rdquo; on purpose to frame this crucial portion of the software supply chain in a particular way. Let&rsquo;s come back down to reality for a bit to talk about what all of this means for well meaning software developers just trying to get their job done.</p>
<h1 id="how-to-improve-npm-safety-while-still-using-npm">How to improve npm safety (while still using npm)</h1>
<p>I should preface this section by clarifying that I&rsquo;m relatively new to this problem at scale, and there are experts far smarter than myself working to solve and educate on these problems. I&rsquo;d still like to close this article off with tips that I have for developers who are worried about how to protect themselves further in the future.</p>
<p>I imagine there are a number of you reading this who are upset that I am appearing to suggest every line of open source code you pull down be audited. We all know that&rsquo;s really not feasible at any scale larger than &ldquo;demo&rdquo;. This is why there are so many tools, such as <a href="https://www.sonarqube.org/">sonarqube</a>, <a href="https://snyk.io/">snyk</a>, and every project&rsquo;s most diligent contributor <a href="https://github.com/features/security">dependabot</a> (I am not affiliated with any, just a few I&rsquo;m familiar with) built with features to track and audit dependencies you&rsquo;ve brought in that may contain vulnerabilities. However, these tools don&rsquo;t necessarily help if you&rsquo;ve accidentally pulled in a bad dependency during development.</p>
<p>When a package is published on npm, save for select circumstances, the version published is there in perpetuity unless npm decides to take it down. Even though <code>faker@6.6.6</code>, which essentially deletes all of its code, is published on npm, it does not remove the history of <code>faker</code> releases. Code can only be <a href="https://docs.npmjs.com/unpublishing-packages-from-the-registry">unpublished from the registry</a> if the package has no dependents, which <code>faker</code> had a number of. In this case, and the case of <code>colors@1.4.44-liberty-2</code>, npm provides the tools to protect against these releases if you are a direct dependent.<br>
If you are a newer developer, I recommend understanding <a href="https://semver.org/">semantic versioning</a> as fully as you can; it is one of the greatest defenses to much of what I&rsquo;ve mentioned in this article. The most common practice when using semantic versioning is to use the <code>^</code> caret prefix on most of your dependencies because this is what npm does by default when installing a new dependency. It means that any updates to the major version will not be installed, but the latest release of that major version will be used. Similarly, there is the <code>~</code> tilde prefix, which will not allow any updates to the minor version and only accepts newer patch releases. Providing no prefix will pin a dependency at a particular version. If you aren&rsquo;t already, it is highly recommended to use more discretion when deciding which prefix to use on new and existing dependencies you choose to bring in.<br>
An important caveat here is that even people who were more reserved by only allowing patch releases of <code>colors</code>, which should suggest only bringing in bug/vulnerability fixes, still got screwed here by unexpectedly allowing a <em>very</em> breaking change. However, this defense is still good against typical benign cases.</p>
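<p>To make the prefixes a bit more concrete, here is a rough sketch of how they decide whether a newly published version gets picked up. This is my own simplified illustration and not npm&rsquo;s actual resolution code: <code>satisfies</code> and <code>parse</code> are hypothetical helpers, the version numbers are made up, and it assumes plain <code>major.minor.patch</code> versions with a major version of 1 or higher and no prerelease tags. The real rules have more edge cases than this.</p>
<pre tabindex="0"><code class="language-javascript" data-lang="javascript">// Simplified sketch of npm-style range prefixes (hypothetical helpers, not npm code).
// Assumes "major.minor.patch" with major >= 1 and no prerelease tags.
function parse(version) {
	return version.split(".").map(Number);
}

function satisfies(version, range) {
	const prefix = "^~".includes(range[0]) ? range[0] : "";
	const [major, minor, patch] = parse(version);
	const [wantMajor, wantMinor, wantPatch] = parse(range.slice(prefix.length));

	// None of the three forms ever crosses a major version boundary.
	if (major !== wantMajor) return false;

	if (prefix === "^") {
		// Caret: any newer minor or patch release is accepted.
		if (minor !== wantMinor) return minor > wantMinor;
		return patch >= wantPatch;
	}

	// Tilde and exact pins both stay on the same minor version.
	if (minor !== wantMinor) return false;

	// Tilde accepts newer patches, no prefix accepts only the exact version.
	return prefix === "~" ? patch >= wantPatch : patch === wantPatch;
}

console.log(satisfies("1.5.2", "^1.4.0")); // true
console.log(satisfies("2.0.0", "^1.4.0")); // false
console.log(satisfies("1.5.2", "~1.4.0")); // false
console.log(satisfies("1.4.9", "~1.4.0")); // true
console.log(satisfies("1.4.9", "1.4.0")); // false
</code></pre><p>Notice that both <code>^1.4.0</code> and <code>~1.4.0</code> accept <code>1.4.9</code>. That is exactly why a bad patch release can slip past either prefix, and only the exact pin rejects it.</p>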
<p>The issue with increased discretion is it usually means you have to do more manual work when it&rsquo;s time to update. Working in the Node.js ecosystem is implicitly accepting that everything moves incredibly fast, and it&rsquo;s a danger to your application&rsquo;s continued health to let things fall too far out of date. While far from a perfect solution, one of my favourite ways to combat this is <a href="https://www.npmjs.com/package/npm-check-updates"><code>npm-check-updates</code></a>. It provides an optional interactive environment to select updates to packages that you feel confident are safe. It is a nice convenience in a process that haunts Node.js developers everywhere.</p>
<p>The sad truth is that there probably is no true way to stop this from affecting you. Semantic versioning is the biggest help when you are a direct dependent of the code you are trying to control. Unfortunately, npm package dependency graphs can go many layers deeper than we bargain for. If you pulled in even one odd dependency that doesn&rsquo;t pin some sub-dependency nicely, and the sub-dependency becomes problematic, you could have an issue that you often can&rsquo;t directly do anything about. This can lead to a frustrating amount of work, and what feels like a lack of control over your codebase if it happens often. For this one, I don&rsquo;t have a great solution. I wish I did, because it&rsquo;s a problem I have had for most of my time in the industry. I wouldn&rsquo;t want to say that a great solution doesn&rsquo;t exist somewhere, but it&rsquo;s probably going to be a burden we have to bear in the Node.js ecosystem to remain safe and secure. The biggest piece of advice is to make sure you are controlling your direct dependencies as tightly as possible the more strict your security requirements are; even when a dependency pulls in a bad sub-dependency, you can protect against the ripple effect if you keep tight control over when you bring the direct dependency in.</p>
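<p>If it helps to picture how a sub-dependency sneaks in, here is a toy sketch that lists every chain of packages leading from an app to one deeply nested dependency. Everything in it is an assumption for illustration: the package names are made up, the graph is hand-written rather than read from a real lockfile, and <code>pathsTo</code> is a hypothetical helper that only works on a graph with no cycles.</p>
<pre tabindex="0"><code class="language-javascript" data-lang="javascript">// Toy dependency graph; every package name here is hypothetical.
const graph = {
	"my-app": ["web-framework", "test-runner"],
	"web-framework": ["logger"],
	"test-runner": ["logger", "pretty-output"],
	"logger": ["pretty-output"],
	"pretty-output": [],
};

// Collect every chain of dependencies that leads from `from` to `target`.
// Assumes the graph has no cycles.
function pathsTo(from, target, trail = []) {
	const here = [...trail, from];
	if (from === target) return [here];
	return graph[from].flatMap((dep) => pathsTo(dep, target, here));
}

for (const path of pathsTo("my-app", "pretty-output")) {
	console.log(path.join(" -> "));
}
</code></pre><p>In this made-up graph, <code>my-app</code> never asks for <code>pretty-output</code> directly, yet it arrives through three different chains. On a real project, running <code>npm ls</code> with a package name gives you the same kind of view of where a package is coming from.</p>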
<h1 id="conclusion">Conclusion</h1>
<p>I think what happened with <code>colors</code> and <code>faker</code> is a fascinating case study into how many of us have become complacent with npm&rsquo;s hidden safety concerns. I love open source software, and I believe we can all do our part to ensure we use it safely. I hope this article provided a new perspective to the situation, and whether you agree or disagree feel free to reach out and discuss! I am interested to hear about your experiences.</p>
]]></content:encoded>
    </item>
    <item>
      <title>The death of the for loop?</title>
      <link>https://blog.ragecage64.com/blog/death-of-for-loop/</link>
      <pubDate>Sat, 13 Mar 2021 00:00:00 +0000</pubDate>
      <guid>https://blog.ragecage64.com/blog/death-of-for-loop/</guid>
      <description>&lt;p&gt;NOTE (Feb 15, 2025): I think this post kinda sucks and I largely disagree with a majority of it now. I&amp;rsquo;ve decided to keep it here for posterity, but my modern sensibilities no longer line up with what I wrote here.&lt;/p&gt;&#xA;&lt;hr&gt;&#xA;&lt;p&gt;Generally introduced to new developers around chapter 4 or 5 of their proverbial Intro to Computer Science books, loops are one of the most fundamental coding constructs a developer learns. The different simple ways we iterate over collections of data are often the core of the most complex applications ever built. This is to dramatically justify the probably-overkill rant I am about to write regarding iterating over a collection of data.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>NOTE (Feb 15, 2025): I think this post kinda sucks and I largely disagree with a majority of it now. I&rsquo;ve decided to keep it here for posterity, but my modern sensibilities no longer line up with what I wrote here.</p>
<hr>
<p>Generally introduced to new developers around chapter 4 or 5 of their proverbial Intro to Computer Science books, loops are one of the most fundamental coding constructs a developer learns. The different simple ways we iterate over collections of data are often the core of the most complex applications ever built. This is to dramatically justify the probably-overkill rant I am about to write regarding iterating over a collection of data.</p>
<p>Truthfully, this title is a misnomer. I don&rsquo;t think <code>for</code> loops need to die. My goal with this post is to present a case for the available alternatives to traditional <code>for</code> loops. Though they aren&rsquo;t technically wrong, I hope to demonstrate the benefits of the alternatives and how I believe they contribute to the enhancement of code quality.
(Ha, my first clickbait title. Unfortunately, this site earns me nothing. Your click paid me $0.)</p>
<p>I will use JavaScript for the code examples since types aren&rsquo;t going to be a factor here, and I feel it&rsquo;s the simplest language to convey the concepts in the post. I will stay as language-agnostic as possible.</p>
<h1 id="the-traditional-for-loop">The traditional <code>for</code> Loop</h1>
<p>I&rsquo;ll start by laying out the traditional <code>for</code> loop everyone knows and loves. While learning the fundamentals of coding you will write your first <code>for</code> loops like this.</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">let</span> <span style="color:#a6e22e">i</span> <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>; <span style="color:#a6e22e">i</span> <span style="color:#f92672">&lt;</span> <span style="color:#ae81ff">10</span>; <span style="color:#a6e22e">i</span><span style="color:#f92672">++</span>) {
</span></span><span style="display:flex;"><span>	<span style="color:#75715e">// Code to execute on each iteration
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}</span></span></code></pre></div><p>I still think this should be the first loop a new developer learns. It&rsquo;s easy to understand; execute the code inside the braces 10 times. Once the developer gets to arrays, and they learn that arrays are addressed 0, 1, 2 etc. to retrieve data, the use case of <code>for</code> loops suddenly clicks:</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">arr</span> <span style="color:#f92672">=</span> [<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">5</span>];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">let</span> <span style="color:#a6e22e">i</span> <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>; <span style="color:#a6e22e">i</span> <span style="color:#f92672">&lt;</span> <span style="color:#a6e22e">arr</span>.<span style="color:#a6e22e">length</span>; <span style="color:#a6e22e">i</span><span style="color:#f92672">++</span>) {
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#a6e22e">arr</span>[<span style="color:#a6e22e">i</span>]);
</span></span><span style="display:flex;"><span>}</span></span></code></pre></div><p>&ldquo;The loop runs from 0 to 4, so I can use <code>i</code> to choose an item from the array on each iteration of the loop! Now I understand what loops are for!&rdquo; - A dramatic re-enactment of me getting to the array lesson in my first coding book, feeling like a genius.<br>
It&rsquo;s understandable why a developer would reach for this by default; any professional developer is certain to have written hundreds of <code>for</code> loops exactly like this throughout their career, so there is rarely anything new for a developer to learn or understand.</p>
<p>Why fix what isn&rsquo;t broken?</p>
<h1 id="foreach-loops"><code>foreach</code> loops</h1>
<p>How many times have you seen a loop like this?</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">arr</span> <span style="color:#f92672">=</span> [<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">5</span>];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">let</span> <span style="color:#a6e22e">i</span> <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>; <span style="color:#a6e22e">i</span> <span style="color:#f92672">&lt;</span> <span style="color:#a6e22e">arr</span>.<span style="color:#a6e22e">length</span>; <span style="color:#a6e22e">i</span><span style="color:#f92672">++</span>) {
</span></span><span style="display:flex;"><span>	<span style="color:#66d9ef">const</span> <span style="color:#a6e22e">value</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">arr</span>[<span style="color:#a6e22e">i</span>];
</span></span><span style="display:flex;"><span>	<span style="color:#75715e">// do stuff with value
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}</span></span></code></pre></div><p>This code isn&rsquo;t inherently wrong, but doesn&rsquo;t it seem like a waste to write a traditional <code>for</code> loop just to assign the value to a local constant variable every time? We don&rsquo;t actually need to modify the source collection, we just need to iterate through it and read each value individually.<br>
Enter the <code>foreach</code> loop. This style of loop cuts out that boilerplate step that assigns a local constant in each iteration of your loop. Instead of each loop iteration having an index, each loop iteration will have an item. In JavaScript, this is implemented using the <code>for...of</code> syntax.</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">arr</span> <span style="color:#f92672">=</span> [<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">5</span>];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">const</span> <span style="color:#a6e22e">value</span> <span style="color:#66d9ef">of</span> <span style="color:#a6e22e">arr</span>) {
</span></span><span style="display:flex;"><span>	<span style="color:#75715e">// do stuff with value
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}</span></span></code></pre></div><p>In my opinion, this second option is a lot cleaner. Using this new syntax, we keep our code more concise and focused by demonstrating our intentions with the data through the syntax.<br>
This is a microcosm of what you&rsquo;ll see in the rest of this post; while it&rsquo;s not wrong to use a traditional for loop, we should find ways to be more explicit about our intention when we iterate through a collection.</p>
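<p>One thing you might miss with <code>for...of</code> is the index. As a minimal sketch (the array contents and the <code>lines</code> variable are made up for illustration), the built-in <code>Array.prototype.entries()</code> method gives you both the index and the item without going back to a counter:</p>

```javascript
const arr = ["a", "b", "c"];
const lines = [];

// entries() yields [index, value] pairs, which we destructure in the loop head
for (const [index, value] of arr.entries()) {
	lines.push(index + "=" + value);
}

console.log(lines); // ["0=a", "1=b", "2=c"]
```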
<h1 id="higher-order-functions">Higher-order Functions</h1>
<p>Rather than mince words to try and explain what a Higher-order Function (hereafter referred to as HOF) is, I will simply link its <a href="https://en.wikipedia.org/wiki/Higher-order_function">Wikipedia article</a>, as well as this <a href="https://eloquentjavascript.net/05_higher_order.html">chapter of Eloquent JavaScript</a>, since this article is largely in JavaScript.<br>
Why are HOFs important to the goals of this post? This fundamental construct unlocks a number of elegant ways to use and transform collections of data with more specificity than is generally possible with native looping constructs.<br>
Assuming that you have read the suggested articles or are already familiar with the required anonymous function syntax, let&rsquo;s look at some HOFs that we can use to work with collections of data. While the examples will still be in JavaScript, nearly every modern language has some version of the methods we&rsquo;ll discuss here.</p>
<h1 id="foreach"><code>forEach</code></h1>
<p>We have learned about <code>foreach</code> loops, implemented in JavaScript as <code>for...of</code>. However, there&rsquo;s a HOF to do essentially the same thing. Let&rsquo;s restate the previous example here:</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">arr</span> <span style="color:#f92672">=</span> [<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">5</span>];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">const</span> <span style="color:#a6e22e">value</span> <span style="color:#66d9ef">of</span> <span style="color:#a6e22e">arr</span>) {
</span></span><span style="display:flex;"><span>	<span style="color:#75715e">// do stuff with value
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}</span></span></code></pre></div><p>The <code>forEach</code> function takes in an anonymous function whose first argument represents an individual element of the collection. So the above example could be refactored to this:</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">arr</span> <span style="color:#f92672">=</span> [<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">5</span>];
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">arr</span>.<span style="color:#a6e22e">forEach</span>(
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">value</span> =&gt; {
</span></span><span style="display:flex;"><span>		<span style="color:#75715e">// do stuff with value
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>	}
</span></span><span style="display:flex;"><span>);</span></span></code></pre></div><p>When I see this function, I think &ldquo;We are going to perform an operation that reads each element of the array individually&rdquo; and I can focus on how we will use each element.<br>
I started with the <code>forEach</code> function because it&rsquo;s a great introduction to HOFs in general. However, to someone not sold on HOFs as a concept, this might look no better than a <code>for...of</code> loop. In truth, the justification for this goes deeper into the concept of immutability and side effects that are core to the Functional Programming Manifesto. (I capitalized that like it was a real book, but unfortunately it&rsquo;s not. Lots of great reading if you search that exact phrase, though.)</p>
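<p>To make the &ldquo;side effects&rdquo; point a little more concrete, here is a small sketch (the <code>seen</code> and <code>result</code> names are just for illustration). <code>forEach</code> always returns <code>undefined</code>, so the only way it can do anything useful is by changing something outside the callback. The callback also receives the index as an optional second argument:</p>

```javascript
const arr = [1, 2, 3, 4, 5];
const seen = [];

// forEach hands the callback the element first, then its index
const result = arr.forEach((value, index) => {
	seen.push(`${index}: ${value}`); // side effect: writes to an outer array
});

console.log(result); // undefined
console.log(seen); // ["0: 1", "1: 2", "2: 3", "3: 4", "4: 5"]
```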
<p>In the interest of staying in scope for this post, let&rsquo;s instead move ahead to some similar HOFs that I believe provide a clear new advantage.</p>
<h1 id="map"><code>map</code></h1>
<p><code>map</code> is designed to handle the scenario where we want to apply a transformation to every element of an array. For example, you may have a loop that builds the 2&rsquo;s times table.</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">arr</span> <span style="color:#f92672">=</span> [<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">5</span>];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">twoTimestable</span> <span style="color:#f92672">=</span> [];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">const</span> <span style="color:#a6e22e">value</span> <span style="color:#66d9ef">of</span> <span style="color:#a6e22e">arr</span>) {
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">twoTimestable</span>.<span style="color:#a6e22e">push</span>(<span style="color:#a6e22e">value</span> <span style="color:#f92672">*</span> <span style="color:#ae81ff">2</span>);
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span><span style="color:#75715e">// twoTimestable = [2, 4, 6, 8, 10]
</span></span></span></code></pre></div><p>When another coder reads this, they will be able to tell that this is a loop to build a new collection of data based on each element of a source. Using the <code>map</code> function, we can instead specify that a new collection is a result of transforming the source&rsquo;s elements individually.</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">arr</span> <span style="color:#f92672">=</span> [<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">5</span>];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">twoTimestable</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">arr</span>.<span style="color:#a6e22e">map</span>(<span style="color:#a6e22e">value</span> =&gt; <span style="color:#a6e22e">value</span> <span style="color:#f92672">*</span> <span style="color:#ae81ff">2</span>);
</span></span><span style="display:flex;"><span><span style="color:#75715e">// twoTimestable = [2, 4, 6, 8, 10]
</span></span></span></code></pre></div><p>When I read the second example, I see the <code>map</code> function and instantly think &ldquo;This is a new collection that is a transformation of the original&rdquo;, and I can focus on what exactly the transformation is. That&rsquo;s the important part anyway; the extra code that manages assigning the results to a new collection is simply boilerplate around what I would consider the unique behaviour of the program. It&rsquo;s what makes this program special, if you will.</p>
<h1 id="filter"><code>filter</code></h1>
<p><code>filter</code> is for when we need specific data out of a collection. Let&rsquo;s say we want an array containing only the elements of our source that are divisible by 3.</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">arr</span> <span style="color:#f92672">=</span> [<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">6</span>, <span style="color:#ae81ff">8</span>, <span style="color:#ae81ff">12</span>];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">divisibleBy3</span> <span style="color:#f92672">=</span> [];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">const</span> <span style="color:#a6e22e">value</span> <span style="color:#66d9ef">of</span> <span style="color:#a6e22e">arr</span>) {
</span></span><span style="display:flex;"><span>	<span style="color:#66d9ef">if</span> (<span style="color:#a6e22e">value</span> <span style="color:#f92672">%</span> <span style="color:#ae81ff">3</span> <span style="color:#f92672">===</span> <span style="color:#ae81ff">0</span>) {
</span></span><span style="display:flex;"><span>		<span style="color:#a6e22e">divisibleBy3</span>.<span style="color:#a6e22e">push</span>(<span style="color:#a6e22e">value</span>);
</span></span><span style="display:flex;"><span>	}
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span><span style="color:#75715e">// divisibleBy3 = [3, 6, 12]
</span></span></span></code></pre></div><p><code>filter</code> will give us similar benefits to <code>map</code> here. <code>filter</code> is a function that will produce a new collection that only contains the elements of source for which the specified function returns <code>true</code>. So the above example can be refactored to this:</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">arr</span> <span style="color:#f92672">=</span> [<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">6</span>, <span style="color:#ae81ff">8</span>, <span style="color:#ae81ff">12</span>];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">divisibleBy3</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">arr</span>.<span style="color:#a6e22e">filter</span>(<span style="color:#a6e22e">value</span> =&gt; <span style="color:#a6e22e">value</span> <span style="color:#f92672">%</span> <span style="color:#ae81ff">3</span> <span style="color:#f92672">===</span> <span style="color:#ae81ff">0</span>);
</span></span><span style="display:flex;"><span><span style="color:#75715e">// divisibleBy3 = [3, 6, 12]
</span></span></span></code></pre></div><p>When I see <code>filter</code>, I think &ldquo;This will be a new collection of data that passes some criteria&rdquo;, and then I can focus on the criteria. As with <code>map</code>, that&rsquo;s what makes this program special.<br>
In my eyes, this seems like the easiest HOF to sell. It is in my opinion the most intuitive because it&rsquo;s the word we would probably use in plain English to describe what we are actually trying to do.</p>
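<p>Because <code>filter</code> and <code>map</code> each return a new array, they can be chained. Here is a rough sketch (the <code>doubledMultiplesOf3</code> name is mine, purely for illustration) that keeps the elements divisible by 3 and then doubles them, leaving the source array untouched:</p>

```javascript
const arr = [1, 3, 6, 8, 12];

const doubledMultiplesOf3 = arr
	.filter(value => value % 3 === 0) // keep 3, 6 and 12
	.map(value => value * 2); // double each element that passed the filter

console.log(doubledMultiplesOf3); // [6, 12, 24]
console.log(arr); // [1, 3, 6, 8, 12]
```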
<h1 id="reduce-traditionally-known-as-fold"><code>reduce</code> (traditionally known as <code>fold</code>)</h1>
<p>This one may be the hardest to sell of the 4 HOFs we&rsquo;re exploring.<br>
<code>reduce</code> is for when we want to take the elements of a collection and derive a single final result from them. A good basic example would be calculating the sum of all elements in an integer array.</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">arr</span> <span style="color:#f92672">=</span> [<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">5</span>];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> <span style="color:#a6e22e">sum</span> <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">const</span> <span style="color:#a6e22e">value</span> <span style="color:#66d9ef">of</span> <span style="color:#a6e22e">arr</span>) {
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">sum</span> <span style="color:#f92672">+=</span> <span style="color:#a6e22e">value</span>;
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span><span style="color:#75715e">// sum = 15 
</span></span></span></code></pre></div><p>While the idea of <code>reduce</code> is simple to explain, its usage is a bit harder to wrap your head around at first. While the previous HOFs have accepted an anonymous function with a single argument (which represents the &ldquo;current element&rdquo;, so to speak), the anonymous function we pass into <code>reduce</code> requires 2: the running value, known as the &ldquo;accumulator&rdquo;, and the current element (like the previous examples). This function returns the new value for the accumulator after applying whatever action is needed for the current element. In this example, the &ldquo;accumulator&rdquo; will be the sum we&rsquo;re calculating. We&rsquo;ll seed the accumulator with some value (in this case 0) as the second argument to the outer <code>reduce</code> function.</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">arr</span> <span style="color:#f92672">=</span> [<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">5</span>];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">sum</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">arr</span>.<span style="color:#a6e22e">reduce</span>(
</span></span><span style="display:flex;"><span>	(<span style="color:#a6e22e">sum</span>, <span style="color:#a6e22e">value</span>) =&gt; <span style="color:#a6e22e">sum</span> <span style="color:#f92672">+</span> <span style="color:#a6e22e">value</span>,
</span></span><span style="display:flex;"><span>	<span style="color:#ae81ff">0</span>
</span></span><span style="display:flex;"><span>);
</span></span><span style="display:flex;"><span><span style="color:#75715e">// sum = 15 
</span></span></span></code></pre></div><p>Building a result that combines all the elements in a collection is a great use of <code>reduce</code>, but it can also be good for finding an element in a collection based on some criteria relative to the other elements in a collection. For example, if we wanted to find the max element in an array with a traditional for loop it would look something like this:</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">arr</span> <span style="color:#f92672">=</span> [<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">5</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">3</span>];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> <span style="color:#a6e22e">max</span> <span style="color:#f92672">=</span> <span style="color:#f92672">-</span><span style="color:#66d9ef">Infinity</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">const</span> <span style="color:#a6e22e">value</span> <span style="color:#66d9ef">of</span> <span style="color:#a6e22e">arr</span>) {
</span></span><span style="display:flex;"><span>	<span style="color:#66d9ef">if</span> (<span style="color:#a6e22e">value</span> <span style="color:#f92672">&gt;</span> <span style="color:#a6e22e">max</span>) {
</span></span><span style="display:flex;"><span>		<span style="color:#a6e22e">max</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">value</span>;
</span></span><span style="display:flex;"><span>	}
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span><span style="color:#75715e">// max = 5
</span></span></span></code></pre></div><p>The name &ldquo;accumulator&rdquo; becomes a slight misnomer in this scenario, because rather than being an accumulation of all the values, it is simply the end result we are interested in. Ignoring that, the <code>reduce</code> we write is pretty similar to the earlier example:</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">arr</span> <span style="color:#f92672">=</span> [<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">5</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">3</span>];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">max</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">arr</span>.<span style="color:#a6e22e">reduce</span>(
</span></span><span style="display:flex;"><span>	(<span style="color:#a6e22e">max</span>, <span style="color:#a6e22e">value</span>) =&gt; {
</span></span><span style="display:flex;"><span>		<span style="color:#66d9ef">if</span> (<span style="color:#a6e22e">value</span> <span style="color:#f92672">&gt;</span> <span style="color:#a6e22e">max</span>) {
</span></span><span style="display:flex;"><span>			<span style="color:#66d9ef">return</span> <span style="color:#a6e22e">value</span>;
</span></span><span style="display:flex;"><span>		}
</span></span><span style="display:flex;"><span>		<span style="color:#66d9ef">return</span> <span style="color:#a6e22e">max</span>;
</span></span><span style="display:flex;"><span>	},
</span></span><span style="display:flex;"><span>	<span style="color:#f92672">-</span><span style="color:#66d9ef">Infinity</span>
</span></span><span style="display:flex;"><span>);
</span></span><span style="display:flex;"><span><span style="color:#75715e">// max = 5 
</span></span></span></code></pre></div><p>You might be thinking &ldquo;you silly goose, this is more lines than the original <code>for</code>!&rdquo;<br>
Correct, I have bamboozled you to demonstrate a common gotcha when writing these HOF argument functions: they do need to return a value. The way we&rsquo;ve been writing them (without braces) implies that the calculation is the return value of the function. However, if you write your first <code>reduce</code> with braces and wonder why the heck it&rsquo;s not working, the first check is to ensure that all code paths return a value.<br>
This example can be written as a one-liner using a ternary expression:</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">arr</span> <span style="color:#f92672">=</span> [<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">5</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">3</span>];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">max</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">arr</span>.<span style="color:#a6e22e">reduce</span>(
</span></span><span style="display:flex;"><span>	(<span style="color:#a6e22e">max</span>, <span style="color:#a6e22e">value</span>) =&gt; <span style="color:#a6e22e">value</span> <span style="color:#f92672">&gt;</span> <span style="color:#a6e22e">max</span> <span style="color:#f92672">?</span> <span style="color:#a6e22e">value</span> <span style="color:#f92672">:</span> <span style="color:#a6e22e">max</span>,
</span></span><span style="display:flex;"><span>	<span style="color:#f92672">-</span><span style="color:#66d9ef">Infinity</span>
</span></span><span style="display:flex;"><span>);
</span></span><span style="display:flex;"><span><span style="color:#75715e">// max = 5 
</span></span></span></code></pre></div><p>When I see a <code>reduce</code>, I think &ldquo;This will take the source collection and build some kind of result out of it&rdquo;, and I can focus on what it needs to do to find that result.</p>
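<p>To see that gotcha in action, here is a sketch of what happens when the braces are there but the <code>return</code> is not. The <code>broken</code> and <code>fixed</code> names are only for illustration, and I&rsquo;m seeding with <code>-Infinity</code> as an assumption of this sketch so the comparison also holds up for negative numbers:</p>

```javascript
const arr = [1, 2, 5, 4, 3];

// Braces without a return: the callback returns undefined on every call,
// so the accumulator is undefined from the second element onward.
const broken = arr.reduce((max, value) => {
	value > max ? value : max; // computed, then thrown away
}, -Infinity);

// Braces with an explicit return behave like the brace-less one-liner.
const fixed = arr.reduce((max, value) => {
	return value > max ? value : max;
}, -Infinity);

console.log(broken); // undefined
console.log(fixed); // 5
```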
<h1 id="this-is-so-sad-i-love-for-loops-there-must-be-some-use-for-them">This is so sad, I love <code>for</code> loops. There must be some use for them!</h1>
<p>Fear not! HOFs are awesome, and in a pure functional language like Haskell, you would be using them all the time. However, if you are not living in the Pure Functional Utopia, there are still some great uses for traditional <code>for</code> loops.</p>
<h1 id="modifying-the-source-collection">Modifying the source collection</h1>
<p>This post has so far assumed that you are only <em>reading</em> from the source collection and producing a new result. I always strive not to modify values in code; it&rsquo;s so nice to always know with certainty what everything in your program is going to contain/equal. However, if for whatever reason you are required to modify a collection in place, a traditional index <code>for</code> loop is still the best way to cleanly do so.</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">arr</span> <span style="color:#f92672">=</span> [<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">3</span>];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">let</span> <span style="color:#a6e22e">i</span> <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>; <span style="color:#a6e22e">i</span> <span style="color:#f92672">&lt;</span> <span style="color:#a6e22e">arr</span>.<span style="color:#a6e22e">length</span>; <span style="color:#a6e22e">i</span><span style="color:#f92672">++</span>) {
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">arr</span>[<span style="color:#a6e22e">i</span>] <span style="color:#f92672">=</span> <span style="color:#ae81ff">69</span>;
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span><span style="color:#75715e">// arr = [69, 69, 69];
</span></span></span></code></pre></div><h1 id="comparing-elements-directly-to-previous-or-future-elements">Comparing elements directly to previous or future elements</h1>
<p>If you are in a scenario where you need to take different action based on elements before/after the current element, the traditional index <code>for</code> loop is going to be your best bet.</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">arr</span> <span style="color:#f92672">=</span> [<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">3</span>];
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">let</span> <span style="color:#a6e22e">i</span> <span style="color:#f92672">=</span> <span style="color:#ae81ff">1</span>; <span style="color:#a6e22e">i</span> <span style="color:#f92672">&lt;</span> <span style="color:#a6e22e">arr</span>.<span style="color:#a6e22e">length</span>; <span style="color:#a6e22e">i</span><span style="color:#f92672">++</span>) {
</span></span><span style="display:flex;"><span>	<span style="color:#66d9ef">if</span> (<span style="color:#a6e22e">arr</span>[<span style="color:#a6e22e">i</span>] <span style="color:#f92672">===</span> <span style="color:#a6e22e">arr</span>[<span style="color:#a6e22e">i</span> <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>]) {
</span></span><span style="display:flex;"><span>		<span style="color:#75715e">// do something 
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>	} <span style="color:#66d9ef">else</span> {
</span></span><span style="display:flex;"><span>		<span style="color:#75715e">// do something different 
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>	}
</span></span><span style="display:flex;"><span>}</span></span></code></pre></div><h1 id="you-are-writing-go">You are writing Go</h1>
<p>Let&rsquo;s take a brief intermission from JavaScript to present a new perspective.<br>
If you&rsquo;re writing Go, you can probably throw everything I&rsquo;ve said in this post out the window. The way Go handles loops is essentially the antithesis to what I&rsquo;ve presented so far.<br>
Not only is a <code>for</code> loop the only way to iterate through a collection of data, the <code>for</code> keyword even replaced the <code>while</code> keyword. This is in service of Go&rsquo;s language design philosophy, which is (in brief) to standardize around one way to do things as much as possible (a philosophy I&rsquo;d love to rant on in a future post).<br>
Go does have a <code>func</code> type, meaning one could implement their own <code>map</code> function like so:</p>





<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#66d9ef">func</span> <span style="color:#a6e22e">Map</span>(<span style="color:#a6e22e">arr</span> []<span style="color:#66d9ef">int</span>, <span style="color:#a6e22e">transform</span> <span style="color:#66d9ef">func</span>(<span style="color:#66d9ef">int</span>) <span style="color:#66d9ef">int</span>) []<span style="color:#66d9ef">int</span> {
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">result</span> <span style="color:#f92672">:=</span> make([]<span style="color:#66d9ef">int</span>, len(<span style="color:#a6e22e">arr</span>))
</span></span><span style="display:flex;"><span>	<span style="color:#66d9ef">for</span> <span style="color:#a6e22e">i</span>, <span style="color:#a6e22e">value</span> <span style="color:#f92672">:=</span> <span style="color:#66d9ef">range</span> <span style="color:#a6e22e">arr</span> {
</span></span><span style="display:flex;"><span>		<span style="color:#a6e22e">result</span>[<span style="color:#a6e22e">i</span>] = <span style="color:#a6e22e">transform</span>(<span style="color:#a6e22e">value</span>)
</span></span><span style="display:flex;"><span>	}
</span></span><span style="display:flex;"><span>	<span style="color:#66d9ef">return</span> <span style="color:#a6e22e">result</span>
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">func</span> <span style="color:#a6e22e">main</span>() {
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">arr</span> <span style="color:#f92672">:=</span> []<span style="color:#66d9ef">int</span>{<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">5</span>}
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">twoTimesTable</span> <span style="color:#f92672">:=</span> <span style="color:#a6e22e">Map</span>(
</span></span><span style="display:flex;"><span>		<span style="color:#a6e22e">arr</span>,
</span></span><span style="display:flex;"><span>		<span style="color:#66d9ef">func</span>(<span style="color:#a6e22e">value</span> <span style="color:#66d9ef">int</span>) <span style="color:#66d9ef">int</span> { <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">value</span> <span style="color:#f92672">*</span> <span style="color:#ae81ff">2</span> },
</span></span><span style="display:flex;"><span>	)
</span></span><span style="display:flex;"><span>	<span style="color:#75715e">// twoTimesTable = [2, 4, 6, 8, 10]</span>
</span></span><span style="display:flex;"><span>}</span></span></code></pre></div><p>(I wrote this myself, but would not have been able to without <a href="https://yourbasic.org/golang/function-pointer-type-declaration/">this post</a> from Algorithms to Go.)</p>
<p>The beauty and curse of Go is that it&rsquo;s up to the developer to implement this if they want it. I don&rsquo;t know any Go developers personally, but I imagine some would turn their noses up at this while others would welcome it. If you happen to be a Go developer, I would love to hear your thoughts on this!</p>
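<p>As a side note, the <code>Map</code> above only works on <code>int</code> slices. Here is a minimal sketch of a more general version, assuming Go 1.18 or newer so that type parameters are available. The type parameter names <code>T</code> and <code>U</code> and the contents of <code>main</code> are just illustrative choices, not the one true way to write this.</p>
<pre tabindex="0"><code class="language-go" data-lang="go">package main

import (
	"fmt"
	"strconv"
)

// Map applies transform to every element of in and returns the results
// in a new slice. T and U are type parameters (Go 1.18+), so the input
// and output element types are allowed to differ.
func Map[T, U any](in []T, transform func(T) U) []U {
	out := make([]U, len(in))
	for i, v := range in {
		out[i] = transform(v)
	}
	return out
}

func main() {
	// int to int, same as the two times table example.
	doubled := Map([]int{1, 2, 3}, func(v int) int { return v * 2 })
	// int to string, which the int-only version cannot express.
	labels := Map(doubled, func(v int) string { return "#" + strconv.Itoa(v) })
	fmt.Println(doubled, labels) // prints: [2 4 6] [#2 #4 #6]
}
</code></pre>
<p>The trade-off is the same as before: the loop still exists, it has just moved into one place, and the call sites only carry the part that changes.</p>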
<h1 id="you-are-writing-c">You are writing C</h1>
<p>Just use <code>for</code> loops.</p>
<h1 id="conclusion">Conclusion</h1>
<p>You may read this post and think to yourself &ldquo;this guy is stupid, it&rsquo;s more readable to just use a for loop&rdquo;.<br>
You might be right given the context of the code you&rsquo;re writing. &ldquo;Readability&rdquo; is pretty subjective, and some people may prefer to see the boilerplate that comes with the <code>for</code> loop examples I presented here. My goal with this post was to present my perspective; I love the way HOFs describe explicit intentions for an iteration through a collection, allowing me to focus on the part of the code that matters.</p>
<p>I hope that you enjoyed reading this post! This is my first public blog post and I&rsquo;m pretty nervous to show it to the world, but I really hope you gained something out of it and did not leave in an unbridled rage. Please follow my socials if you enjoyed this post and want to read more of my ramblings!</p>
]]></content:encoded>
    </item>
    <item>
      <title>Projects</title>
      <link>https://blog.ragecage64.com/projects/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>https://blog.ragecage64.com/projects/</guid>
      <description>&lt;p&gt;Here is a quick list of my personal projects, both previous and active! Most of them are MIT licensed, with a couple exceptions.&lt;/p&gt;&#xA;&lt;h1 id=&#34;tools-and-libraries&#34;&gt;Tools and Libraries&lt;/h1&gt;&#xA;&lt;hr&gt;&#xA;&lt;h3 id=&#34;yamlfmt-httpsgithubcomgoogleyamlfmt&#34;&gt;yamlfmt &lt;a href=&#34;https://github.com/google/yamlfmt&#34;&gt;https://github.com/google/yamlfmt&lt;/a&gt;&lt;/h3&gt;&#xA;&lt;h4 id=&#34;language-go&#34;&gt;Language: Go&lt;/h4&gt;&#xA;&lt;p&gt;A command line yaml formatting tool, also structured as a library for extensibility or custom wrappers.&lt;/p&gt;&#xA;&lt;p&gt;This is my largest open source success. The tool has over 1k GitHub Stars, and each release &lt;a href=&#34;https://tooomm.github.io/github-release-stats/?username=google&amp;amp;repository=yamlfmt&#34;&gt;gets tens-to-hundreds of thousands of downloads&lt;/a&gt;.&lt;/p&gt;&#xA;&lt;h3 id=&#34;go-utf8-codepoint-converter-httpsgithubcomragecage64go-utf8-codepoint-converter&#34;&gt;go-utf8-codepoint-converter &lt;a href=&#34;https://github.com/RageCage64/go-utf8-codepoint-converter&#34;&gt;https://github.com/RageCage64/go-utf8-codepoint-converter&lt;/a&gt;&lt;/h3&gt;&#xA;&lt;h4 id=&#34;language-go-1&#34;&gt;Language: Go&lt;/h4&gt;&#xA;&lt;p&gt;Tool to convert UTF-8 codepoint text to the unicode character the text represents.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>Here is a quick list of my personal projects, both previous and active! Most of them are MIT licensed, with a couple exceptions.</p>
<h1 id="tools-and-libraries">Tools and Libraries</h1>
<hr>
<h3 id="yamlfmt-httpsgithubcomgoogleyamlfmt">yamlfmt <a href="https://github.com/google/yamlfmt">https://github.com/google/yamlfmt</a></h3>
<h4 id="language-go">Language: Go</h4>
<p>A command line yaml formatting tool, also structured as a library for extensibility or custom wrappers.</p>
<p>This is my largest open source success. The tool has over 1k GitHub Stars, and each release <a href="https://tooomm.github.io/github-release-stats/?username=google&amp;repository=yamlfmt">gets tens-to-hundreds of thousands of downloads</a>.</p>
<h3 id="go-utf8-codepoint-converter-httpsgithubcomragecage64go-utf8-codepoint-converter">go-utf8-codepoint-converter <a href="https://github.com/RageCage64/go-utf8-codepoint-converter">https://github.com/RageCage64/go-utf8-codepoint-converter</a></h3>
<h4 id="language-go-1">Language: Go</h4>
<p>Tool to convert UTF-8 codepoint text to the unicode character the text represents.</p>
<h3 id="fluent-bit-lua-tester-httpsgithubcomragecage64flb_lua_tester">Fluent Bit Lua Tester <a href="https://github.com/RageCage64/flb_lua_tester">https://github.com/RageCage64/flb_lua_tester</a></h3>
<h4 id="language-rust">Language: Rust</h4>
<p>Allows you to run Lua scripts meant for Fluent Bit scripting in a sanitized environment with specific input and expected output.</p>
<h3 id="collections-go-httpsgitragecage64comragecage64collections-go">collections-go <a href="https://git.ragecage64.com/RageCage64/collections-go">https://git.ragecage64.com/RageCage64/collections-go</a></h3>
<h4 id="language-go-2">Language: Go</h4>
<p>A library that implements common data structures for Go with best possible time complexity and minimal allocations.</p>
<h3 id="multilinediff-httpsgitragecage64comragecage64multilinediff">multilinediff <a href="https://git.ragecage64.com/RageCage64/multilinediff">https://git.ragecage64.com/RageCage64/multilinediff</a></h3>
<h4 id="language-go-3">Language: Go</h4>
<p>A library to write multiline diff output to the command line.</p>
<h1 id="open-source-work">Open Source Work</h1>
<hr>
<p>Most of my open source work is done under my Google GitHub profile: <a href="https://github.com/braydonk">https://github.com/braydonk</a></p>
<h3 id="opentelemetry-httpsgithubcomopen-telemetry">OpenTelemetry <a href="https://github.com/open-telemetry">https://github.com/open-telemetry</a></h3>
<h4 id="language-go-4">Language: Go</h4>
<p>Contributing to OpenTelemetry in a couple of ways:</p>
<ul>
<li>Codeowner of the <a href="https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/hostmetricsreceiver">hostmetricsreceiver</a> in the OpenTelemetry Collector</li>
<li>Member of the <a href="https://github.com/open-telemetry/semantic-conventions/blob/main/docs/non-normative/groups/system/design-philosophy.md">System Semantic Conventions</a> Working Group</li>
</ul>
<h3 id="fluent-bit-httpsgithubcomfluentfluent-bit">Fluent Bit <a href="https://github.com/fluent/fluent-bit">https://github.com/fluent/fluent-bit</a></h3>
<h4 id="language-c">Language: C</h4>
<p>An open source observability agent, which we use on my team at Google as part of the Ops Agent. I have helped fix a number of bugs in Fluent Bit, and I do code reviews and maintenance on the <code>out_stackdriver</code> plugin.</p>
<h3 id="monkey-httpsgithubcommonkeymonkey">Monkey <a href="https://github.com/monkey/monkey">https://github.com/monkey/monkey</a></h3>
<h4 id="language-c-1">Language: C</h4>
<p>An HTTP server written in C. It is a crucial component of Fluent Bit, and I have done some work on this repo to support fixes in Fluent Bit, as well as adding unit tests to the repo.</p>
<h1 id="youtube">YouTube</h1>
<hr>
<p>I do have a YouTube channel at <a href="https://www.youtube.com/@RageCageCodes-ik2ue">https://www.youtube.com/@RageCageCodes-ik2ue</a>. I only have one video as of writing. I was thinking I might make more, but I found making tutorial content not as exciting as I&rsquo;d hoped. I&rsquo;m keeping it on the backburner just in case!</p>
<h1 id="gaming">Gaming</h1>
<hr>
<h3 id="trustfall-httpsgitragecage64comragecage64trustfall">TrustFall <a href="https://git.ragecage64.com/RageCage64/TrustFall">https://git.ragecage64.com/RageCage64/TrustFall</a></h3>
<h4 id="language-c-2">Language: C++</h4>
<p>A Root Beer Tapper ripoff that I wrote as a school project. Uses Allegro 5 because I had to (well technically I had to use 4 but I refused to do that and accepted the consequences).</p>
<h3 id="spaceforce-httpsgitragecage64comragecage64spaceforce">SpaceForce <a href="https://git.ragecage64.com/RageCage64/SpaceForce">https://git.ragecage64.com/RageCage64/SpaceForce</a></h3>
<h4 id="language-c-3">Language: C++</h4>
<p>A SHMUP that I also wrote for a school project.</p>
<h3 id="seenoevil-httpsgitragecage64comragecage64seenoevil">SeeNoEvil <a href="https://git.ragecage64.com/RageCage64/SeeNoEvil">https://git.ragecage64.com/RageCage64/SeeNoEvil</a></h3>
<h4 id="language-c-4">Language: C#</h4>
<p>My entry to the 8-bits-to-infinity game jam. Sadly, I did not save the assets. My partner who drew them insists they weren&rsquo;t worth keeping, but I thought they looked pretty good. :D<br>
The main thing I want to extract out of this is the code that worked with <a href="https://www.mapeditor.org/">Tiled</a>. I thought it was reasonably sophisticated for something I coded in under a week, and it would be cool to turn it into a standalone library.</p>
]]></content:encoded>
    </item>
    <item>
      <title>Talks</title>
      <link>https://blog.ragecage64.com/talks/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>https://blog.ragecage64.com/talks/</guid>
      <description>&lt;p&gt;This is a collection of public recorded talks I&amp;rsquo;ve done.&lt;/p&gt;&#xA;&lt;h1 id=&#34;prepared-talks&#34;&gt;Prepared Talks&lt;/h1&gt;&#xA;&lt;h2 id=&#34;deep-dive-how-fluent-bit-collects-file-logs&#34;&gt;Deep Dive: How Fluent Bit Collects File Logs&lt;/h2&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://www.youtube.com/watch?v=KrlvWBCGagI&#34;&gt;https://www.youtube.com/watch?v=KrlvWBCGagI&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;This is my talk for Observability Day North America, a co-located event with KubeCon NA 2024. It was a lightning talk, but it ended up being a really dense talk and probably could have been full-sized. To compensate, I talked really fast!&lt;/p&gt;&#xA;&lt;p&gt;T-shirt: Iron Maiden&lt;/p&gt;&#xA;&lt;h2 id=&#34;tuning-otel-collector-performance-through-profiling&#34;&gt;Tuning OTel Collector Performance Through Profiling&lt;/h2&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://www.youtube.com/watch?v=qMxxjB4meXo&#34;&gt;https://www.youtube.com/watch?v=qMxxjB4meXo&lt;/a&gt;&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>This is a collection of public recorded talks I&rsquo;ve done.</p>
<h1 id="prepared-talks">Prepared Talks</h1>
<h2 id="deep-dive-how-fluent-bit-collects-file-logs">Deep Dive: How Fluent Bit Collects File Logs</h2>
<p><a href="https://www.youtube.com/watch?v=KrlvWBCGagI">https://www.youtube.com/watch?v=KrlvWBCGagI</a></p>
<p>This is my talk for Observability Day North America, a co-located event with KubeCon NA 2024. It was a lightning talk, but it ended up being a really dense talk and probably could have been full-sized. To compensate, I talked really fast!</p>
<p>T-shirt: Iron Maiden</p>
<h2 id="tuning-otel-collector-performance-through-profiling">Tuning OTel Collector Performance Through Profiling</h2>
<p><a href="https://www.youtube.com/watch?v=qMxxjB4meXo">https://www.youtube.com/watch?v=qMxxjB4meXo</a></p>
<p>This was a talk for OpenTelemetry Community Day 2024. It goes through my experience profiling parts of the OpenTelemetry Collector to find performance improvements.</p>
<p>Retractions: One of the solutions I presented in this talk, for getting the Parent Process ID on Windows, had a flawed premise: it ignored the fact that the increase in WMI memory usage offset the gains made in the Collector. So it ended up not being that big of a win, and we&rsquo;re still working to find an alternate method for getting the Parent Process ID.</p>
<p>T-shirt: Brook from One Piece Wanted Poster</p>
<h2 id="how-much-overhead-how-to-evaluate-observability-agent-performance">How Much Overhead: How to Evaluate Observability Agent Performance</h2>
<p><a href="https://www.youtube.com/watch?v=BIaftvtFPHg">https://www.youtube.com/watch?v=BIaftvtFPHg</a></p>
<p>This is my talk for Observability Day 2023, a co-located event with KubeCon NA 2023. It was inspired by situations at work where people would ask things like &ldquo;which agent has less overhead?&rdquo; without fully qualifying their goals. I wanted to break the problem down into more actionable pieces.</p>
<p>T-shirt: Meshuggah Catch-33</p>
<h2 id="learning-to-fly-how-to-find-bottlenecks-in-your-agents">Learning To Fly: How to Find Bottlenecks in your Agents</h2>
<p><a href="https://www.youtube.com/watch?v=jf7t1CpoKlg&amp;t=176s">https://www.youtube.com/watch?v=jf7t1CpoKlg&amp;t=176s</a></p>
<p>This was a remote talk I did for the <a href="https://www.youtube.com/@isitobservable">Is It Observable</a> YouTube channel (awesome channel, highly recommend subscribing). This was perhaps the hardest I ever prepared for a talk, because it came with an in-depth <a href="https://github.com/braydonk/learning-to-fly-lightning-talk">reproducible demo</a> that ran in a Dockerfile and included code to graph OpenTelemetry Metrics directly in the CLI. It was a lot of fun to prepare and I think it&rsquo;s one of my best talks (if you can get around the fact that my mic sounded TERRIBLE).</p>
<p>T-shirt: Zoro from One Piece</p>
<p>Background friend: A plush of the character Acrid from my favourite video game, Risk of Rain 2</p>
<h1 id="tutorials">Tutorials</h1>
<h2 id="5-levels-of-go-error-handling">5 Levels of Go Error Handling</h2>
<p><a href="https://www.youtube.com/watch?v=y5utZCeHys0&amp;t=1s">https://www.youtube.com/watch?v=y5utZCeHys0&amp;t=1s</a></p>
<p>This was my one attempt at &ldquo;content creation&rdquo;. It&rsquo;s a relatively beginner-focused tutorial about Go error handling and how to do some more advanced things. The video was picked up by the algorithm this past summer and started getting a lot more attention. I&rsquo;m not completely cutting myself off from making more videos in the future, but I did not have as much fun as I thought I would making this video so I&rsquo;m not sure if I&rsquo;ll make more. I think this tutorial is pretty good for what it is though and I&rsquo;ll keep it around anyway!</p>
<p>T-shirt: It&rsquo;s obscured!</p>
<p>Background friend: Acrid from Risk of Rain 2 again</p>
<h1 id="interviews">Interviews</h1>
<h2 id="kubecon-na-2024-with-is-it-observable">KubeCon NA 2024 with Is It Observable</h2>
<p><a href="https://www.youtube.com/watch?v=qf0OjAEzprs&amp;t=365s">https://www.youtube.com/watch?v=qf0OjAEzprs&amp;t=365s</a></p>
<p>This was an interview with the <a href="https://www.youtube.com/@isitobservable">Is It Observable</a> YouTube channel. I talked a bit about the talk I was giving the next day at Observability Day, as well as some general best practices for managing performance of agents collecting logs.</p>
<p>T-shirt: Coheed and Cambria, Vaxis II tour shirt</p>
<h2 id="humans-of-otel---kubecon-na-2024">Humans of OTel - KubeCon NA 2024</h2>
<p><a href="https://www.youtube.com/watch?v=TIMgKXCeiyQ">https://www.youtube.com/watch?v=TIMgKXCeiyQ</a></p>
<p>I was featured in the Humans of OTel series of interviews at KubeCon NA 2024, along with a lot of amazing peers from the OpenTelemetry Community!</p>
<p>T-shirt: Video filmed too high to tell!</p>
<h2 id="kubecon-na-2023-with-is-it-observable">KubeCon NA 2023 with Is It Observable</h2>
<p><a href="https://www.youtube.com/watch?v=5arixRhAIbs&amp;t=161s">https://www.youtube.com/watch?v=5arixRhAIbs&amp;t=161s</a></p>
<p>An interview with <a href="https://www.youtube.com/@isitobservable">Is It Observable</a> from KubeCon NA 2023. This was my first time doing something like this so I was definitely more nervous, but it was great practice!</p>
<p>T-shirt: Meshuggah Catch-33</p>
]]></content:encoded>
    </item>
  </channel>
</rss>
