Roman Cheplyaka
https://ro-che.info/articles/
Articles by Roman Cheplyaka
-
StateT vs. IORef: a benchmark
<p>Sometimes I’m writing an IO loop in Haskell, and I need some sort of
counter or accumulator. The two main options are to use a mutable
reference (IORef) or to put a StateT transformer on top of the IO
monad.</p>
<p>I was curious, though, if there was a difference in efficiency
between these two approaches. Intuitively, IORefs are dedicated heap
objects, while a StateT transformer’s state becomes “just” a local
variable, so StateT might optimize better. But how much of a difference
does it make?</p>
<p>So I benchmarked four functions, all of which calculate the sum
of the numbers between 1 and <code>n = 1000</code>.</p>
<p><code>base_sum</code> simply calls <code>sum</code> from the base
package; <code>state_sum</code> and <code>stateT_sum</code> maintain the
accumulator using the <code>State Int</code> and
<code>StateT Int IO</code> monads, respectively; and
<code>ioref_sum</code> uses an <code>IORef</code> within the
<code>IO</code> monad. Here are the results, as reported by
criterion.</p>
<figure>
<img src="/img/StateT-vs-IORef.svg"
alt="Mean execution times reported by criterion. The error bars are the lower and upper bounds of the mean as reported by criterion, which I think are 95% bootstrap confidence intervals." />
<figcaption aria-hidden="true">Mean execution times reported by
criterion. The error bars are the lower and upper bounds of the mean as
reported by criterion, which I think are 95% bootstrap confidence
intervals.</figcaption>
</figure>
<p>I’m not sure how <code>stateT_sum</code> manages to be faster than
<code>state_sum</code> and <code>base_sum</code> (this doesn’t appear to
be a statistical fluke), but what’s clear is that <code>ioref_sum</code>
is significantly slower than all the others.</p>
<p>So if 3ns per state access matters to you, go for <code>StateT</code>
even when you are in <code>IO</code>.</p>
<p>(Update: also check out the <a
href="https://old.reddit.com/r/haskell/comments/knne96/statet_vs_ioref_a_benchmark/">comments
on reddit</a>, especially the ones by u/VincentPepper.)</p>
<p>Here’s the full benchmark code. It was compiled with <code>-O2</code>
by GHC 8.8.4 and run on an AMD Ryzen 7 3700X.</p>
<div class="sourceCode" id="cb1"><pre
class="sourceCode haskell"><code class="sourceCode haskell"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="kw">import</span> <span class="dt">Criterion</span></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="kw">import</span> <span class="dt">Criterion.Main</span></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="kw">import</span> <span class="dt">Control.Monad.State</span></span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a><span class="kw">import</span> <span class="dt">Data.IORef</span></span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a><span class="ot">base_sum ::</span> <span class="dt">Int</span> <span class="ot">-></span> <span class="dt">Int</span></span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>base_sum n <span class="ot">=</span> <span class="fu">sum</span> [<span class="dv">1</span> <span class="op">..</span> n]</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a><span class="ot">state_sum ::</span> <span class="dt">Int</span> <span class="ot">-></span> <span class="dt">Int</span></span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>state_sum n <span class="ot">=</span> <span class="fu">flip</span> execState <span class="dv">0</span> <span class="op">$</span></span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> forM_ [<span class="dv">1</span><span class="op">..</span>n] <span class="op">$</span> \i <span class="ot">-></span></span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a> modify' (<span class="op">+</span>i)</span>
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a><span class="ot">stateT_sum ::</span> <span class="dt">Int</span> <span class="ot">-></span> <span class="dt">IO</span> <span class="dt">Int</span></span>
<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a>stateT_sum n <span class="ot">=</span> <span class="fu">flip</span> execStateT <span class="dv">0</span> <span class="op">$</span></span>
<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a> forM_ [<span class="dv">1</span><span class="op">..</span>n] <span class="op">$</span> \i <span class="ot">-></span></span>
<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a> modify' (<span class="op">+</span>i)</span>
<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a><span class="ot">ioref_sum ::</span> <span class="dt">Int</span> <span class="ot">-></span> <span class="dt">IO</span> <span class="dt">Int</span></span>
<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a>ioref_sum n <span class="ot">=</span> <span class="kw">do</span></span>
<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a> ref <span class="ot"><-</span> newIORef <span class="dv">0</span></span>
<span id="cb1-23"><a href="#cb1-23" aria-hidden="true" tabindex="-1"></a> forM_ [<span class="dv">1</span><span class="op">..</span>n] <span class="op">$</span> \i <span class="ot">-></span></span>
<span id="cb1-24"><a href="#cb1-24" aria-hidden="true" tabindex="-1"></a> modifyIORef' ref (<span class="op">+</span>i)</span>
<span id="cb1-25"><a href="#cb1-25" aria-hidden="true" tabindex="-1"></a> readIORef ref</span>
<span id="cb1-26"><a href="#cb1-26" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-27"><a href="#cb1-27" aria-hidden="true" tabindex="-1"></a>main <span class="ot">=</span> <span class="kw">do</span></span>
<span id="cb1-28"><a href="#cb1-28" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> n <span class="ot">=</span> <span class="dv">1000</span></span>
<span id="cb1-29"><a href="#cb1-29" aria-hidden="true" tabindex="-1"></a> defaultMain</span>
<span id="cb1-30"><a href="#cb1-30" aria-hidden="true" tabindex="-1"></a> [ bench <span class="st">"base_sum"</span> <span class="op">$</span> whnf base_sum n</span>
<span id="cb1-31"><a href="#cb1-31" aria-hidden="true" tabindex="-1"></a> , bench <span class="st">"state_sum"</span> <span class="op">$</span> whnf state_sum n</span>
<span id="cb1-32"><a href="#cb1-32" aria-hidden="true" tabindex="-1"></a> , bench <span class="st">"stateT_sum"</span> <span class="op">$</span> whnfAppIO stateT_sum n</span>
<span id="cb1-33"><a href="#cb1-33" aria-hidden="true" tabindex="-1"></a> , bench <span class="st">"ioref_sum"</span> <span class="op">$</span> whnfAppIO ioref_sum n</span>
<span id="cb1-34"><a href="#cb1-34" aria-hidden="true" tabindex="-1"></a> ]</span></code></pre></div>
Tue, 29 Dec 2020 20:00:00 +0000
https://ro-che.info/articles/2020-12-29-statet-vs-ioref
https://ro-che.info//articles/2020-12-29-statet-vs-ioref.html
-
Laptop vs. desktop for compiling Haskell code
<p>I’ve been using various laptops as daily drivers for the last 12
years, and I’ve never felt they were inadequate — until this year. There
were a few things that made me put together a desktop PC last month, but
a big reason was to improve my Haskell compilation experience on big
projects.</p>
<p>So let’s test how fast Haskell code compiles on a laptop vs. a
desktop.</p>
<h2 id="specs">Specs</h2>
<table>
<thead>
<tr>
<th></th>
<th style="text-align: right;">Laptop</th>
<th style="text-align: right;">Desktop</th>
</tr>
</thead>
<tbody>
<tr>
<td>CPU</td>
<td style="text-align: right;">Intel Core i7-6500U</td>
<td style="text-align: right;">AMD Ryzen 7 3700X</td>
</tr>
<tr>
<td>Base clock</td>
<td style="text-align: right;">2.5 GHz</td>
<td style="text-align: right;">3.6 GHz</td>
</tr>
<tr>
<td>Boost clock</td>
<td style="text-align: right;">3.1 GHz</td>
<td style="text-align: right;">4.4 GHz</td>
</tr>
<tr>
<td>Number of cores</td>
<td style="text-align: right;">2</td>
<td style="text-align: right;">8</td>
</tr>
<tr>
<td>Memory speed</td>
<td style="text-align: right;">2133 MT/s</td>
<td style="text-align: right;">4000 MT/s</td>
</tr>
</tbody>
</table>
<h2 id="methodology">Methodology</h2>
<p>I picked four Haskell packages for this test: pandoc, lens, hledger,
and criterion. An individual test consists of building one of these
packages or all of them together (represented here by a meta-package
called <code>all</code>).</p>
<p>The build time includes the time to build all of the transitive
dependencies. All sources are pre-downloaded, so just the compilation is
timed.</p>
<p>The compilation is done using stack (current master with <a
href="https://github.com/commercialhaskell/stack/issues/5435#issuecomment-749036479">a
custom patch</a>), GHC 8.8.4, and the lts-16.26 Stackage snapshot, with
the default flags.</p>
<p>The build time of each package (including the <code>all</code>
meta-package) is measured 3 times, with all tests happening in a random
order. There is a 2-minute break after each build to let the CPU cool
down.</p>
<p>The CPU frequency governor is set to <code>performance</code> while
compiling and to <code>powersave</code> during the cooling breaks.</p>
<p>To calculate the average level of parallelism achieved on each
package, I divide the user CPU time by the wall-clock time (as reported
by GNU time’s <code>%U</code> and <code>%e</code>, respectively), using
the data from the desktop benchmark (as it has more potential for
parallelism).</p>
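<p>For illustration, here is a minimal sketch of that calculation in R,
assuming the GNU time measurements have been collected into a
hypothetical data frame called <code>timings</code>:</p>
<pre class="r"><code># Sketch only: `timings` is assumed to have columns `package`,
# `user` (GNU time's %U, in seconds) and `wall` (GNU time's %e, in seconds),
# taken from the desktop runs.
library(dplyr)

parallelism <- timings %>%
  group_by(package) %>%
  summarize(parallelism = mean(user / wall))</code></pre>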
<p>The full benchmark script is available <a
href="/files/2020-12-22-haskell-compilation-laptop-desktop/benchmark-compilation-time.sh">here</a>.</p>
<p>I also measured the average power drawn by each computer, both while
running the benchmark and when idle. As my power meter only
reports the instantaneous power and cumulative energy, I measured the
cumulative energy (in W⋅h) at several random time points and fitted an
ordinary least squares linear regression to find the average power.</p>
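<p>As a minimal sketch of that regression (assuming the meter readings
are collected into a hypothetical data frame <code>readings</code>):</p>
<pre class="r"><code># `readings` is assumed to have columns `hours` (time since the start of
# the measurement) and `wh` (cumulative energy shown by the meter, in W⋅h).
fit <- lm(wh ~ hours, data = readings)

# The slope of cumulative energy over time is the average power, in watts.
avg_power_w <- coef(fit)[["hours"]]</code></pre>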
<h2 id="results">Results</h2>
<p>The first result is that I had to take the laptop outside the house
(0°C) to even be able to finish this benchmark; otherwise the computer
would overheat and shut down. While the laptop was outside, the CPU
temperature would rise up to 74°C. The desktop, on the other hand, had
no issue keeping itself cool (< 60°C) at room temperature with
only the stock coolers.</p>
<p>And here are the timings.</p>
<!-- html table generated in R 4.0.3 by xtable 1.8-4 package -->
<!-- Tue Dec 22 17:50:44 2020 -->
<table>
<caption align="bottom">
Mean compile times (minutes:seconds) and their ratio
</caption>
<tr>
<th>
package
</th>
<th>
desktop
</th>
<th>
laptop
</th>
<th>
ratio
</th>
</tr>
<tr>
<td>
lens
</td>
<td align="right">
01:50
</td>
<td align="right">
02:53
</td>
<td align="right">
1.57
</td>
</tr>
<tr>
<td>
criterion
</td>
<td align="right">
03:49
</td>
<td align="right">
06:05
</td>
<td align="right">
1.59
</td>
</tr>
<tr>
<td>
hledger
</td>
<td align="right">
04:28
</td>
<td align="right">
07:51
</td>
<td align="right">
1.75
</td>
</tr>
<tr>
<td>
pandoc
</td>
<td align="right">
14:07
</td>
<td align="right">
22:30
</td>
<td align="right">
1.59
</td>
</tr>
<tr>
<td>
all
</td>
<td align="right">
15:20
</td>
<td align="right">
26:48
</td>
<td align="right">
1.75
</td>
</tr>
</table>
<figure>
<img src="/img/haskell-compilation-laptop-desktop/timings.svg"
alt="The column height represents the mean time, and the error bars (which collapse into thick black lines) show the maximum and minimum of the 3 runs" />
<figcaption aria-hidden="true">The column height represents the mean
time, and the error bars (which collapse into thick black lines) show
the maximum and minimum of the 3 runs</figcaption>
</figure>
<p>We can also see how well the desktop/laptop speed ratio is predicted
by the parallelism achieved for each package.</p>
<p><img
src="/img/haskell-compilation-laptop-desktop/timings-vs-parallelism.svg" /></p>
<p>The average power (where averaging also includes the cooling breaks)
drawn during the benchmark was 19W for the laptop and 65W for the
desktop.</p>
<p>The average idle power was 3W for the laptop and 37W for the
desktop.</p>
<h2 id="conclusions">Conclusions</h2>
<ol type="1">
<li><p>The overheating laptop issue is real and has happened to me
numerous times while working on real projects, forcing me to limit the
number of threads and making the compilation even slower. This alone
made getting a desktop PC worth it.</p></li>
<li><p>There’s a decent increase in the compilation speed, but it’s not
huge. The average time ratio (1.65) is much closer to the ratio of clock
frequencies (1.42–1.44) than to the difference in the combined power of
all cores. Also, the laptop/desktop time ratio grows only slowly with the
level of parallelism. My interpretation of this is that the (dual-core,
4-thread) laptop is capable of exploiting most of the parallelism
available when building these packages.</p>
<p>So the way things are today, I’d say a quad-core or probably even a
dual-core CPU is enough for a Haskell developer to compile code.</p>
<p>That said, I hope that our build systems become better at parallelism
over the coming years.</p></li>
<li><p>In terms of power efficiency, the laptop is a clear winner: twice
as power-efficient for compilation (after adjusting for the speed
difference) and 13 times as power-efficient when idle.
<!-- NB: the ratios were computed using more precision than reported above,
hence the results may appear wrong, but they aren't --></p></li>
<li><p>I also played a bit with overclocking the desktop’s CPU. I’m not
an experienced overclocker and didn’t dare to go to the extreme
settings, but moderate overclocking (raising the clock speed to 3.8 GHz
or enabling MSI Game Boost) actually resulted in longer compile times.
My understanding is that overclocking affects all cores, while the CPU’s
default “boosting” logic (which is disabled by overclocking) can
significantly raise the clock frequency of one or two cores when needed.
The latter seems to be a much better fit for a compilation workload,
where most of the cores are idle most of the time.</p></li>
</ol>
<h2 id="acknowledgments">Acknowledgments</h2>
<p>Thanks to Félix Baylac-Jacqué for educating me about the modern PC
parts.</p>
Tue, 22 Dec 2020 20:00:00 +0000
https://ro-che.info/articles/2020-12-22-haskell-compilation-laptop-desktop
https://ro-che.info//articles/2020-12-22-haskell-compilation-laptop-desktop.html
-
How I integrate ghcid with vim/neovim
<p><a href="https://github.com/ndmitchell/ghcid">ghcid</a> by Neil
Mitchell is a simple but robust tool to get instant error messages for
your Haskell code.</p>
<p>For the most part, it doesn’t require any integration with your
editor or IDE, which is exactly what makes it robust—if you can run
ghci, you can run ghcid. There’s one feature though for which the editor
and ghcid have to talk to one another: the ability to quickly jump to
the location of the error.</p>
<p>The “official” way to integrate ghcid with neovim is <a
href="https://github.com/ndmitchell/ghcid/tree/master/plugins/nvim">the
plugin</a>. However, the plugin insists on running ghcid from within
nvim, which makes the whole thing less robust. For instance, I often
need to run ghci/ghcid in a different environment than my editor, like
in a nix shell or a docker container.</p>
<p>Therefore, I use a simpler, plugin-less setup. After all, vim/nvim
already have a feature to read the compiler output, called <a
href="https://neovim.io/doc/user/quickfix.html">quickfix</a>, and ghcid
is able to write ghci’s output to a file. All we need is a few tweaks to
make them play well together. This article describes the setup, which
I’ve been happily using for 1.5 years now.</p>
<h2 id="ghcid-setup">ghcid setup</h2>
<p>ghcid passes some flags to ghci which make its output a bit harder
to parse.</p>
<p>Therefore, I build a modified version of ghcid, with a different
default set of flags.</p>
<p>(There are probably ways to achieve this that do not require
recompiling ghcid, but this is what I prefer—so that when I run ghcid,
it simply does what I want.)</p>
<p>The patch you need to apply is very simple:</p>
<div class="sourceCode" id="cb1"><pre
class="sourceCode diff"><code class="sourceCode diff"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="kw">--- src/Ghcid.hs</span></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="dt">+++ src/Ghcid.hs</span></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="dt">@@ -97,7 +97,7 @@ options = cmdArgsMode $ Options</span></span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> ,restart = [] &= typ "PATH" &= help "Restart the command when the given file or directory contents change (defaults to .ghci and any .cabal file, unless when using stack or a custom command)"</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> ,reload = [] &= typ "PATH" &= help "Reload when the given file or directory contents change (defaults to none)"</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> ,directory = "." &= typDir &= name "C" &= help "Set the current directory"</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a><span class="st">- ,outputfile = [] &= typFile &= name "o" &= help "File to write the full output to"</span></span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a><span class="va">+ ,outputfile = ["quickfix"] &= typFile &= name "o" &= help "File to write the full output to"</span></span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> ,ignoreLoaded = False &= explicit &= name "ignore-loaded" &= help "Keep going if no files are loaded. Requires --reload to be set."</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> ,poll = Nothing &= typ "SECONDS" &= opt "0.1" &= explicit &= name "poll" &= help "Use polling every N seconds (defaults to using notifiers)"</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> ,max_messages = Nothing &= name "n" &= help "Maximum number of messages to print"</span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a><span class="kw">--- src/Language/Haskell/Ghcid/Util.hs</span></span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a><span class="dt">+++ src/Language/Haskell/Ghcid/Util.hs</span></span>
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a><span class="dt">@@ -47,7 +47,8 @@ ghciFlagsRequiredVersioned =</span></span>
<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a> -- | Flags that make ghcid work better and are supported on all GHC versions</span>
<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a> ghciFlagsUseful :: [String]</span>
<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a> ghciFlagsUseful =</span>
<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a><span class="st">- ["-ferror-spans" -- see #148</span></span>
<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a><span class="va">+ ["-fno-error-spans"</span></span>
<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a><span class="va">+ ,"-fno-diagnostics-show-caret"</span></span>
<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a> ,"-j" -- see #153, GHC 7.8 and above, but that's all I support anyway</span>
<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a> ]</span></code></pre></div>
<p>Alternatively, you can clone my fork of ghcid at <a
href="https://github.com/UnkindPartition/ghcid"
class="uri">https://github.com/UnkindPartition/ghcid</a>, which already
contains the patch.</p>
<p>Apart from changing the default flags passed to ghci, the patch also
tells ghcid to write the ghci output to a file called
<code>quickfix</code> by default, so that you don’t have to pass
<code>-o quickfix</code> on the command line every time.</p>
<h2 id="vimneovim-setup">vim/neovim setup</h2>
<p>Here are the vim pieces that you’ll need to put into your
<code>.vimrc</code> or <code>init.vim</code>. First, set the
<code>errorformat</code> option to tell vim how to parse ghci’s error
messages:</p>
<pre class="viml"><code>set errorformat=%C%*\\s•\ %m,
\%-C\ %.%#,
\%A%f:%l:%c:\ %t%.%#</code></pre>
<p>Don’t ask me how it works—it’s been a long time since I wrote it—but
it works.</p>
<p>Next, I prefer to define a few keybindings that make quickfix’ing
easier:</p>
<pre class="viml"><code>map <F5> :cfile quickfix<CR>
map <C-j> :cnext<CR>
map <C-k> :cprevious<CR></code></pre>
<p>When I see any errors in the ghcid window, I press <code>F5</code> to
load them into vim and jump to the first error. Then, if I need to, I
use Ctrl-j and Ctrl-k to jump between different errors.</p>
Wed, 08 Jul 2020 20:00:00 +0000
https://ro-che.info/articles/2020-07-08-integrate-ghcid-vim
https://ro-che.info//articles/2020-07-08-integrate-ghcid-vim.html
-
Visualizing Haskell heap profiles in 2020
<p>Heap profiling is a feature of the Glasgow Haskell Compiler (GHC)
that lets a program record its own memory usage by type, module, cost
center, or other attribute, and write it to a <code>program.hp</code>
file.</p>
<p>Here I review the existing tools—and introduce a new one—for
visualizing and analyzing these profiles.</p>
<h2 id="hp2ps">hp2ps</h2>
<p>hp2ps is the standard heap profile visualizer, as it comes bundled
with GHC.</p>
<p>Run it as</p>
<pre><code>hp2ps -c benchmark.hp</code></pre>
<p>(where <code>-c</code> makes the output colored), and it will produce
the file <code>benchmark.ps</code>, which you can open with many
document viewers.</p>
<p>Here’s what the output looks like:</p>
<figure>
<img src="/img/hp2ps.svg" alt="An example graph produced by hp2ps" />
<figcaption aria-hidden="true">An example graph produced by
hp2ps</figcaption>
</figure>
<p>The example shows a heap profile by the <em>cost center</em> stack
that allocated the data. As I mentioned, there are many other types of
heap profiles, but this is what I’ll be using here as an example.</p>
<p>As you see, the cost centers on the right are truncated. I usually
like to see them longer. They are actually truncated by the profiled
program itself, not by the visualizer, so to get longer names, rerun
your program with <code>+RTS -hc -L500</code> to increase the maximum
length from the default of 25 characters to, say, 500.</p>
<p>However, hp2ps doesn’t deal well with long cost center stacks (or
other long identifiers) by default: the whole page would be filled with
identifiers, and there would be no room left for the graph itself. To
work around that, pass <code>-M</code> to hp2ps; this produces a two-page
.ps file, with the legend on the first page and the graph on the second
one.</p>
<p>I found that viewers like Okular and Evince only display the second
page of the two-page .ps file, but it works if you first convert the
output to pdf with ps2pdf. Here’s what the output looks like:</p>
<figure>
<img src="/img/hp2ps-page1.svg" /> <img src="/img/hp2ps-page2.svg" />
<figcaption>
Two-page output from hp2ps -M
</figcaption>
</figure>
<h2 id="hp2pretty">hp2pretty</h2>
<p>hp2pretty by Claude Heiland-Allen has a few advantages over hp2ps: a
nicer output with transparency and grid lines, truncation of long cost
center stacks, and the ability to write the full cost center stacks to a
file using a <code>--key</code> option.</p>
<p>Run it simply as</p>
<pre><code>hp2pretty benchmark.hp</code></pre>
<p>and it will produce a file named <code>benchmark.svg</code>.</p>
<figure>
<img src="/img/hp2pretty.svg" alt="Example output of hp2pretty" />
<figcaption aria-hidden="true">Example output of hp2pretty</figcaption>
</figure>
<h2 id="hpd3.js">hp/D3.js</h2>
<p>hp/D3.js by Edward Z. Yang is an online tool to visualize Haskell
heap profiles. There’s a hosted version at <a
href="https://heap.ezyang.com/">heap.ezyang.com</a>, and there is <a
href="https://github.com/ezyang/hpd3js">the source code</a> on
GitHub.</p>
<p>I wasn’t able to build the source code due to the dependency on
hp2any (see below), but the hosted version still works. The disadvantage
of the hosted version is that you have to upload your heap profile to
the server, and it becomes public—consider this when working on
proprietary projects. (The profile files do not contain any source code,
but even the function names and call stacks may reveal too much
information in some cases.)</p>
<p>hp/D3.js offers a choice of three different styles of pretty graphs
shown below. You can also <a
href="https://heap.ezyang.com/view/f267b68e008f3f5cc64a088106f8882ec8f097c6">browse
this profile</a> yourself. There are some cool interactive features,
like the entry’s name or call stack being highlighted when you hover
over the corresponding part of the graph.</p>
<figure>
<img src="/img/hpd3js-1.png" alt="hp/d3.js: stacked graph" />
<figcaption aria-hidden="true">hp/d3.js: stacked graph</figcaption>
</figure>
<figure>
<img src="/img/hpd3js-2.png" alt="hp/d3.js: normalized stacked graph" />
<figcaption aria-hidden="true">hp/d3.js: normalized stacked
graph</figcaption>
</figure>
<figure>
<img src="/img/hpd3js-3.png" alt="hp/d3.js: overlayed area graph" />
<figcaption aria-hidden="true">hp/d3.js: overlayed area
graph</figcaption>
</figure>
<h2 id="perl-r">Perl & R</h2>
<p>Sometimes a quick look at the heap profile graph is all you need to
understand what to do next. Other times, a more detailed analysis is
required. In such cases, my favorite way is to convert an .hp file to
csv and load it into R.</p>
<p>To convert an .hp file to csv, I wrote a short Perl script, <a
href="/files/2020-05-14-visualize-haskell-heap-profiles/hp2csv">hp2csv</a>.
(Unlike many tools written in Haskell, there’s a good chance it’ll
continue working in 10 years.) Put it somewhere in your PATH, make it
executable (<code>chmod +x ~/bin/hp2csv</code>), and run</p>
<pre><code>hp2csv benchmark.hp > benchmark.csv</code></pre>
<p>The CSV has a simple format:</p>
<pre><code>time,name,value
0.094997,(487)getElements/CAF:getElements,40
0.094997,(415)CAF:$cfoldl'_r3hK,32
0.094997,(412)CAF:$ctoList_r3hH,32
0.094997,(482)match/main/Main.CAF,24
0.094997,(480)main/Main.CAF,32</code></pre>
<p>where <code>time</code> is the time in seconds since the program
start, <code>name</code> is the name of the cost center/type/etc.
(depending on what kind of heap profiling you did), and
<code>value</code> is the number of bytes.</p>
<p>Now let’s load this into R and try to reproduce the above graphs
using ggplot.</p>
<div class="sourceCode" id="cb5"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(scales) <span class="co"># for a somewhat better color scheme</span></span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a>csv <span class="ot"><-</span> <span class="fu">read_csv</span>(<span class="st">"benchmark.csv"</span>) <span class="sc">%>%</span></span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a> <span class="co"># convert bytes to megabytes</span></span>
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a> <span class="fu">mutate</span>(<span class="at">value =</span> value <span class="sc">/</span> <span class="fl">1e6</span>) <span class="sc">%>%</span></span>
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a> <span class="co"># absent measurements are 0s</span></span>
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a> <span class="fu">complete</span>(time,name, <span class="at">fill =</span> <span class="fu">list</span>(<span class="at">value =</span> <span class="dv">0</span>))</span>
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a> </span>
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-11"><a href="#cb5-11" aria-hidden="true" tabindex="-1"></a><span class="co"># find top 15 entries and sort them</span></span>
<span id="cb5-12"><a href="#cb5-12" aria-hidden="true" tabindex="-1"></a>top_names <span class="ot"><-</span> csv <span class="sc">%>%</span></span>
<span id="cb5-13"><a href="#cb5-13" aria-hidden="true" tabindex="-1"></a> <span class="fu">group_by</span>(name) <span class="sc">%>%</span></span>
<span id="cb5-14"><a href="#cb5-14" aria-hidden="true" tabindex="-1"></a> <span class="fu">summarize</span>(<span class="at">sum_value =</span> <span class="fu">sum</span>(value)) <span class="sc">%>%</span></span>
<span id="cb5-15"><a href="#cb5-15" aria-hidden="true" tabindex="-1"></a> <span class="fu">arrange</span>(<span class="fu">desc</span>(sum_value)) <span class="sc">%>%</span></span>
<span id="cb5-16"><a href="#cb5-16" aria-hidden="true" tabindex="-1"></a> <span class="fu">head</span>(<span class="at">n=</span><span class="dv">15</span>) <span class="sc">%>%</span></span>
<span id="cb5-17"><a href="#cb5-17" aria-hidden="true" tabindex="-1"></a> <span class="fu">mutate</span>(<span class="at">name_sorted =</span> <span class="fu">str_trunc</span>(name,<span class="dv">30</span>),</span>
<span id="cb5-18"><a href="#cb5-18" aria-hidden="true" tabindex="-1"></a> <span class="at">name_sorted =</span> <span class="fu">factor</span>(name_sorted, <span class="at">levels=</span>name_sorted))</span>
<span id="cb5-19"><a href="#cb5-19" aria-hidden="true" tabindex="-1"></a>top_entries <span class="ot"><-</span></span>
<span id="cb5-20"><a href="#cb5-20" aria-hidden="true" tabindex="-1"></a> <span class="fu">inner_join</span>(csv, top_names, <span class="at">by=</span><span class="st">"name"</span>)</span>
<span id="cb5-21"><a href="#cb5-21" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-22"><a href="#cb5-22" aria-hidden="true" tabindex="-1"></a><span class="co"># Create a custom color palette based on the 'viridis' palette.</span></span>
<span id="cb5-23"><a href="#cb5-23" aria-hidden="true" tabindex="-1"></a><span class="co"># Use 'sample' to shuffle the colors,</span></span>
<span id="cb5-24"><a href="#cb5-24" aria-hidden="true" tabindex="-1"></a><span class="co"># so that adjacent areas are not similarly colored.</span></span>
<span id="cb5-25"><a href="#cb5-25" aria-hidden="true" tabindex="-1"></a>colors <span class="ot"><-</span> <span class="cf">function</span>(n) {</span>
<span id="cb5-26"><a href="#cb5-26" aria-hidden="true" tabindex="-1"></a> <span class="fu">set.seed</span>(<span class="dv">2020</span>)</span>
<span id="cb5-27"><a href="#cb5-27" aria-hidden="true" tabindex="-1"></a> <span class="fu">sample</span>(<span class="fu">viridis_pal</span>(<span class="at">option=</span><span class="st">"A"</span>,<span class="at">alpha=</span><span class="fl">0.7</span>)(n))</span>
<span id="cb5-28"><a href="#cb5-28" aria-hidden="true" tabindex="-1"></a>}</span>
<span id="cb5-29"><a href="#cb5-29" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-30"><a href="#cb5-30" aria-hidden="true" tabindex="-1"></a><span class="fu">theme_set</span>(<span class="fu">theme_bw</span>())</span>
<span id="cb5-31"><a href="#cb5-31" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(top_entries,<span class="fu">aes</span>(time,value,<span class="at">fill=</span>name_sorted)) <span class="sc">+</span></span>
<span id="cb5-32"><a href="#cb5-32" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_area</span>(<span class="at">position=</span><span class="st">"stack"</span>) <span class="sc">+</span></span>
<span id="cb5-33"><a href="#cb5-33" aria-hidden="true" tabindex="-1"></a> <span class="fu">discrete_scale</span>(<span class="at">aesthetics =</span> <span class="st">"fill"</span>,</span>
<span id="cb5-34"><a href="#cb5-34" aria-hidden="true" tabindex="-1"></a> <span class="at">scale_name =</span> <span class="st">"viridis modified"</span>,</span>
<span id="cb5-35"><a href="#cb5-35" aria-hidden="true" tabindex="-1"></a> <span class="at">palette =</span> colors) <span class="sc">+</span></span>
<span id="cb5-36"><a href="#cb5-36" aria-hidden="true" tabindex="-1"></a> <span class="fu">scale_y_continuous</span>(<span class="at">breaks=</span><span class="cf">function</span>(limits) <span class="fu">seq</span>(<span class="dv">0</span>, <span class="fu">floor</span>(limits[[<span class="dv">2</span>]]), <span class="at">by=</span><span class="dv">10</span>)) <span class="sc">+</span></span>
<span id="cb5-37"><a href="#cb5-37" aria-hidden="true" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">x=</span><span class="st">"seconds"</span>, <span class="at">y=</span><span class="st">"MB"</span>, <span class="at">fill =</span> <span class="st">"Cost center"</span>)</span></code></pre></div>
<figure>
<img src="/img/heap-profile-ggplot-stacked.svg"
alt="A stacked graph of the heap profile (produced with ggplot)" />
<figcaption aria-hidden="true">A stacked graph of the heap profile
(produced with ggplot)</figcaption>
</figure>
<p>But these stacked plots are not always the best way to represent the
data. Let’s see what happens if we try a simple line plot.</p>
<div class="sourceCode" id="cb6"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>top_entries <span class="sc">%>%</span></span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">ggplot</span>(<span class="fu">aes</span>(time,value,<span class="at">color=</span>name_sorted)) <span class="sc">+</span></span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_line</span>() <span class="sc">+</span></span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">scale_y_continuous</span>(<span class="at">breaks=</span><span class="cf">function</span>(limits) <span class="fu">seq</span>(<span class="dv">0</span>, <span class="fu">floor</span>(limits[[<span class="dv">2</span>]]), <span class="at">by=</span><span class="dv">5</span>)) <span class="sc">+</span></span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">x=</span><span class="st">"seconds"</span>, <span class="at">y=</span><span class="st">"MB"</span>, <span class="at">color =</span> <span class="st">"Cost center"</span>)</span></code></pre></div>
<figure>
<img src="/img/heap-profile-ggplot-lines.svg"
alt="A line graph of the heap profile" />
<figcaption aria-hidden="true">A line graph of the heap
profile</figcaption>
</figure>
<p>This looks weird, doesn’t it? Do those lines merge, or does one of
them just disappear?</p>
<p>To disentangle this graph a bit, we can add a random offset for each
cost-center.</p>
<div class="sourceCode" id="cb7"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">2020</span>)</span>
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a>top_entries <span class="sc">%>%</span></span>
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">group_by</span>(name) <span class="sc">%>%</span></span>
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">mutate</span>(<span class="at">value =</span> value <span class="sc">+</span> <span class="fu">runif</span>(<span class="dv">1</span>,<span class="dv">0</span>,<span class="dv">3</span>)) <span class="sc">%>%</span></span>
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a> ungroup <span class="sc">%>%</span></span>
<span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a> <span class="fu">ggplot</span>(<span class="fu">aes</span>(time,value,<span class="at">color=</span>name_sorted)) <span class="sc">+</span></span>
<span id="cb7-7"><a href="#cb7-7" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_line</span>() <span class="sc">+</span></span>
<span id="cb7-8"><a href="#cb7-8" aria-hidden="true" tabindex="-1"></a> <span class="fu">scale_y_continuous</span>(<span class="at">breaks=</span><span class="cf">function</span>(limits) <span class="fu">seq</span>(<span class="dv">0</span>, <span class="fu">floor</span>(limits[[<span class="dv">2</span>]]), <span class="at">by=</span><span class="dv">5</span>)) <span class="sc">+</span></span>
<span id="cb7-9"><a href="#cb7-9" aria-hidden="true" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">x=</span><span class="st">"seconds"</span>, <span class="at">y=</span><span class="st">"MB"</span>, <span class="at">color =</span> <span class="st">"Cost center"</span>)</span></code></pre></div>
<figure>
<img src="/img/heap-profile-ggplot-lines-random-offset.svg"
alt="A line graph of the heap profile, with a random offset added per cost center" />
<figcaption aria-hidden="true">A line graph of the heap profile, with a
random offset added per cost center</figcaption>
</figure>
<p>So it’s not a glitch, and indeed several cost centers have identical
dynamics. It’s not hard to imagine why this could happen: think about
tuples whose elements occupy the same amount of space but are produced
by different cost centers. As these tuples are consumed and
garbage-collected, the corresponding lines remain in perfect sync. But
this effect wasn’t obvious at all from the stacked plot, was it?</p>
<p>Another thing that is hard to understand from a stacked plot is how
different cost centers compare, say, in terms of their maximum resident
size. But in R, we can easily visualize this with a simple bar plot:</p>
<div class="sourceCode" id="cb8"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a>top_entries <span class="ot"><-</span> csv <span class="sc">%>%</span></span>
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">group_by</span>(name) <span class="sc">%>%</span></span>
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">summarize</span>(<span class="at">max_value =</span> <span class="fu">max</span>(value)) <span class="sc">%>%</span></span>
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">filter</span>(max_value <span class="sc">>=</span> <span class="dv">1</span>) <span class="sc">%>%</span></span>
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a> <span class="fu">arrange</span>(max_value) <span class="sc">%>%</span></span>
<span id="cb8-6"><a href="#cb8-6" aria-hidden="true" tabindex="-1"></a> <span class="fu">mutate</span>(<span class="at">name =</span> <span class="fu">str_trunc</span>(name, <span class="dv">120</span>), <span class="at">name =</span> <span class="fu">factor</span>(name, <span class="at">levels=</span>name))</span>
<span id="cb8-7"><a href="#cb8-7" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(top_entries, <span class="fu">aes</span>(name,max_value)) <span class="sc">+</span> <span class="fu">geom_col</span>(<span class="at">fill=</span><span class="fu">viridis_pal</span>(<span class="at">alpha=</span><span class="fl">0.7</span>)(<span class="dv">5</span>)[[<span class="dv">4</span>]]) <span class="sc">+</span></span>
<span id="cb8-8"><a href="#cb8-8" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_text</span>(<span class="fu">aes</span>(name,<span class="at">label=</span>name),<span class="at">y=</span><span class="dv">0</span>,<span class="at">hjust=</span><span class="st">"left"</span>) <span class="sc">+</span></span>
<span id="cb8-9"><a href="#cb8-9" aria-hidden="true" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">x=</span><span class="st">"Cost center"</span>, <span class="at">y=</span><span class="st">"Memory, MB"</span>) <span class="sc">+</span></span>
<span id="cb8-10"><a href="#cb8-10" aria-hidden="true" tabindex="-1"></a> <span class="fu">scale_x_discrete</span>(<span class="at">breaks=</span><span class="cn">NULL</span>) <span class="sc">+</span></span>
<span id="cb8-11"><a href="#cb8-11" aria-hidden="true" tabindex="-1"></a> <span class="fu">scale_y_continuous</span>(<span class="at">breaks=</span><span class="cf">function</span>(limits) <span class="fu">seq</span>(<span class="dv">0</span>, <span class="fu">floor</span>(limits[[<span class="dv">2</span>]]), <span class="at">by=</span><span class="dv">5</span>)) <span class="sc">+</span></span>
<span id="cb8-12"><a href="#cb8-12" aria-hidden="true" tabindex="-1"></a> <span class="fu">coord_flip</span>()</span></code></pre></div>
<figure>
<img src="/img/heap-profile-ggplot-barplot.svg"
alt="A bar plot of the maximum residenct size per cost center" />
<figcaption aria-hidden="true">A bar plot of the maximum residenct size
per cost center</figcaption>
</figure>
<p>Finally, in R you are not limited to just visualization; you can do
all sorts of data analyses. For instance, a few years back I needed to
verify that, in a server process, a certain function was not consuming
increasingly more memory over time. I used this technique to load the
heap profile into R and verify that with more confidence than I would
have had from looking at a stacked graph.</p>
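<p>As a rough sketch of that kind of check (reusing the <code>csv</code>
data frame and the tidyverse functions loaded above; the cost center
name <code>myFunction</code> is made up for the example), one can fit a
linear trend to the suspect cost center and look at its slope:</p>
<pre class="r"><code>suspect <- csv %>%
  filter(str_detect(name, "myFunction"))

fit <- lm(value ~ time, data = suspect)
summary(fit)  # a slope close to zero suggests no steady growth over time</code></pre>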
<h2 id="hp2any">hp2any</h2>
<p>One issue with big Haskell projects is that, if not actively
maintained, they tend to bitrot due to changes in the compiler, the
Haskell dependencies, or even the C dependencies.</p>
<p>One such example is Patai Gergely’s <a
href="https://github.com/cobbpg/hp2any">hp2any</a>. It no longer builds
with the current version of the <code>network</code> package because of
some API changes. But even when I tried to build it with the included
<code>stack.yaml</code> file, I got</p>
<pre><code>glib > Linking /tmp/stack214336/glib-0.13.6.0/.stack-work/dist/x86_64-linux-tinfo6/Cabal-2.2.0.1/setup/setup ...
glib > Configuring glib-0.13.6.0...
glib > build
glib > Preprocessing library for glib-0.13.6.0..
glib > setup: Error in C header file.
glib >
glib > /usr/include/glib-2.0/glib/gspawn.h:76: (column 22) [FATAL]
glib > >>> Syntax error!
glib > The symbol `__attribute__' does not fit here.
glib > </code></pre>
<p>I’m guessing (only guessing) that this issue is fixed in the latest
versions of the glib Haskell package, but we can’t benefit from that
when using an old <code>stack.yaml</code>. This also shows a flaw in
some people’s argument that if you put prospective upper bounds on your
Haskell dependencies, your projects will build forever.</p>
<p>(At this point, someone will surely mention nix and how it would’ve
helped here. It probably would, but as an owner of a 50GB /nix
directory, I’m not so enthusiastic about adding another 5GB there
consisting of old OpenGL and GTK libraries just to get a heap profile
visualizer.)</p>
Thu, 14 May 2020 20:00:00 +0000
https://ro-che.info/articles/2020-05-14-visualize-haskell-heap-profiles
https://ro-che.info//articles/2020-05-14-visualize-haskell-heap-profiles.html
-
Compile and link a Haskell package against a local C library
<p>Let’s say you want to build a Haskell package against a locally built
version of a C library for testing/debugging purposes. Doing this is
easy once you know the right option names, but finding this information
took me some time, so I’m recording it here for reference.</p>
<p>Let’s say the headers of your local library are in
<code>/home/user/src/mylib/include</code> and the library files
(<code>*.so</code> or <code>*.a</code>) are in
<code>/home/user/src/mylib/lib</code>. Then you can put the following
into your <code>stack.yaml</code> (tested with stack v2.2.0;
instructions for cabal-install should be similar):</p>
<div class="sourceCode" id="cb1"><pre
class="sourceCode yaml"><code class="sourceCode yaml"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="fu">extra-include-dirs</span><span class="kw">:</span></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> /home/user/src/mylib/include</span></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="fu">extra-lib-dirs</span><span class="kw">:</span></span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> /home/user/src/mylib/lib</span></span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a><span class="fu">ghc-options</span><span class="kw">:</span></span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">"$locals"</span><span class="kw">:</span><span class="at"> -optl=-Wl,-rpath,/home/user/src/mylib/lib</span></span></code></pre></div>
<p>Here <code>"$locals"</code> means <a
href="https://docs.haskellstack.org/en/stable/yaml_configuration/#ghc-options">“apply
the options to all local packages”</a>.</p>
Tue, 07 Apr 2020 20:00:00 +0000
https://ro-che.info/articles/2020-04-07-haskell-local-c-library
https://ro-che.info//articles/2020-04-07-haskell-local-c-library.html