Paul's Notes
https://notes.pault.ag/
Recent content on Paul's Notes (Hugo, gohugo.io, en-us)
Mon, 27 Oct 2025 13:15:00 -0400

It's NOT always DNS.
https://notes.pault.ag/its-not-always-dns/
Mon, 27 Oct 2025 13:15:00 -0400

<p>I’ve written down a new rule (no name, sorry) that I’ll be repeating to myself
and those around me. <strong>“If you can replace ‘DNS’ with ‘key value store mapping
a name to an ip’ and it still makes sense, it was not, in fact, DNS.”</strong> Feel
free to repeat it along with me.</p>
<p>Sure, the “It’s always DNS” meme is funny the first few hundred times you see
it – but what’s less funny is when critical thinking ends because a DNS query
is involved. DNS failures are often the first observable problem <em>because</em>
it’s one of the first things that needs to be done. DNS is fairly complicated,
implementation-dependent, and at times – frustrating to debug – but it is not
the operational hazard it’s made out to be. It’s at best a shallow take, and at
worst actively holding teams back from understanding their true operational
risks.</p>
<p>IP connectivity failures between a host and the rest of the network are <em>not</em> a
reason to blame DNS. This would happen no matter how you distribute the updated
name to IP mappings. Wiping out
<a href="https://aws.amazon.com/message/101925/">all the records during the course of operations due to an automation bug</a>
is <em>not</em> a reason to blame DNS. This, too, would happen no matter how you
distribute the name to IP mappings. Something made the choice to delete all the
mappings, and <a href="https://web.archive.org/web/20251005205731/https://www.team.net/mjb/hawg.html">it did what you asked it to do</a>.</p>
<p>There’s plenty of annoying DNS specific sharp edges to blame when things <em>do</em>
go wrong (like <code>8.8.8.8</code> and <code>1.1.1.1</code> disagreeing about resolving a domain
because of DNSSEC, or since we’re on the topic, a
<a href="https://slack.engineering/what-happened-during-slacks-dnssec-rollout/">DNSSEC rollout bricking prod for hours</a>)
for us to be cracking jokes anytime a program makes a DNS request.</p>
<p>We can do better.</p>

The Promised LAN
https://notes.pault.ag/tpl/
Mon, 16 Jun 2025 11:58:00 -0400

<p>The Internet has changed a lot in the last 40+ years. Fads have come and gone.
Network protocols have been designed, deployed, adopted, and abandoned.
Industries have come and gone. The types of people on the internet have changed
a lot. The number of people on the internet has changed a lot, creating an
information medium unlike anything ever seen before in human history. There’s a
lot of good things about the Internet as of 2025, <strong>but there’s also an
inescapable hole in what it used to be, for me</strong>.</p>
<p>I miss being able to throw a site up to send around to friends to play with
without worrying about hordes of AI-feeding HTML combine harvesters DoS-ing my
website, costing me thousands in network transfer for the privilege. I miss
being able to put a lightly authenticated game server up and not worry too much
at night – wondering if that process is now mining bitcoin. I miss being able
to run a server in my home closet. Decades of cat and mouse games have rendered
running a mail server nearly impossible. Those who are “brave” enough to try
are met with weeks-long stretches of delivery failures and countless hours
yelling ineffectually into a pipe that leads from the cheerful lobby of some
disinterested corporation directly into a void somewhere 4 layers below ground
level.</p>
<p>I miss the spirit of curiosity, exploration, and trying new things. I miss
building things for fun without having to worry about being too successful,
after which “security” offices start demanding my supplier paperwork in
triplicate as heartfelt thanks from their engineering teams. I miss communities
that are run because it is important to them, not for ad revenue. I miss
community operated spaces and having more than four websites that are all full
of nothing except screenshots of each other.</p>
<p>Every other page I find myself on now has an AI generated click-bait title,
shared for rage-clicks all brought-to-you-by-our-sponsors–completely covered
wall-to-wall with popup modals, telling me how much they respect my privacy,
with the real content hidden at the bottom bracketed by deceptive ads served by
companies that definitely know which new coffee shop I went to last month.</p>
<p>This is wrong, and those who have seen what was know it.</p>
<p><strong>I can’t keep doing it. I’m not doing it any more. I reject the notion that
this is as it needs to be. It is wrong. The hole left in what the Internet used
to be must be filled. I will fill it.</strong></p>
<h2 id="what-comes-before-part-b">What comes before part b?</h2>
<p>Throughout the 2000s, some of my favorite memories were from LAN parties at my
friends’ places. Dragging your setup somewhere, long nights playing games,
goofing off, even building software all night to get something working—being
able to do something fiercely technical in the context of a uniquely social
activity. It wasn’t really much about the games or the projects—it was an
excuse to spend time together, just hanging out. A huge reason I learned so
much in college was that campus was a non-stop LAN party – we could freely
stand up servers, talk between dorms on the LAN, and hit my dorm room computer
from the lab. Things could go from individual to social in the matter of
seconds. The Internet used to work this way—my dorm had public IPs handed out
by DHCP, and my workstation could serve traffic from anywhere on the internet.
I haven’t been back to campus in a few years, but I’d be surprised if this were
still the case.</p>
<p>In December of 2021, three of us got together and connected our houses together
in what we now call The Promised LAN. The idea is simple—fill the hole we feel
is gone from our lives. Build our own always-on 24/7 nonstop LAN party. Build a
space that is intrinsically social, even though we’re doing technical things.
We can freely host insecure game servers or one-off side projects without
worrying about what someone will do with it.</p>
<p>Over the years, it’s evolved very slowly—we haven’t pulled any all-nighters.
Our mantra has become “old growth”, building each layer carefully. As of May
2025, the LAN is now 19 friends running around 25 network segments. Those 25
networks are connected to 3 backbone nodes, exchanging routes and IP traffic
for the LAN. We refer to the set of backbone operators as “The Bureau of LAN
Management”. Combined decades of operating critical infrastructure have
driven The Bureau to make a set of well-understood, boring, predictable,
interoperable and easily debuggable decisions to make this all happen.
<a href="https://tpl.house/">Nothing here is exotic or even technically interesting</a>.</p>
<h2 id="applications-of-trusting-trust">Applications of trusting trust</h2>
<p>The hardest part, however, is rejecting the idea that anything outside our own
LAN is untrustworthy—nearly irreversible damage inflicted on us by the
Internet. We have solved this by not solving it. We strictly control
membership—the absolute hard minimum for joining the LAN requires 10 years of
friendship with at least one member of the Bureau, with another 10 years of
friendship planned. Members of the LAN can veto new members even if all other
criteria are met. Even with those strict rules, there’s no shortage of friends
that meet the qualifications—but we are not equipped to take that many folks
on. It’s hard to join—both socially and technically. Doing something malicious
on the LAN requires a lot of highly technical effort upfront, and it would
endanger a decade of friendship. We have relied on those human, social,
interpersonal bonds to bring us all together. It’s worked for the last 4 years,
and it should continue working until we think of something better.</p>
<p>We assume roommates, partners, kids, and visitors all have access to The
Promised LAN. If they’re let into our friends’ network, there is a level of
trust that works transitively for us—I trust them to be on mine. This LAN is
not for “security”, rather, the network border is a social one. Benign
“hacking”—in the original sense of misusing systems to do fun and interesting
things—is encouraged. Robust ACLs and firewalls on the LAN are, by definition,
an interpersonal—not technical—failure. We all trust every other network
operator to run their segment in a way that aligns with our collective values
and norms.</p>
<p>Over the last 4 years, we’ve grown our own culture and fads—around half of the
people on the LAN have thermal receipt printers with open access, for printing
out quips or jokes on each other’s counters. It’s incredible how much network
transport and a trusting culture gets you—there’s a 3-node IRC network, exotic
hardware to gawk at, radios galore, a NAS storage swap, LAN only email, and
even a SIP phone network of “redphones”.</p>
<h2 id="diy">DIY</h2>
<p>We do not wish to, nor will we, rebuild the internet. We do not wish to, nor
will we, scale this. We will never be friends with enough people, as hard as we
may try. Participation hinges on us all having fun. As a result, membership
will never be open, and we will never have enough connected LANs to deal with
the technical and social problems that start to happen with scale. This is a
feature, not a bug.</p>
<p>This is a call for you to do the same. Build your own LAN. Connect it with
friends’ homes. Remember what is missing from your life, and fill it in. Use
software you know how to operate and get it running. Build slowly. Build your
community. Do it with joy. Remember how we got here. Rebuild a community space
that doesn’t need to be mediated by faceless corporations and ad revenue. Build
something sustainable that brings you joy. Rebuild something you use daily.</p>
<p>Bring back what we’re missing.</p>

boot2kier
https://notes.pault.ag/boot2kier/
Thu, 20 Feb 2025 09:40:00 -0500

<p>I can’t remember exactly the joke I was making at the time in my
<a href="https://zoo.dev">work’s</a> slack instance (I’m sure it wasn’t particularly
funny, though; and not even worth re-reading the thread to work out), but it
wound up with me writing a UEFI binary for the punchline. Not to spoil the
ending but it worked - no pesky kernel, no messing around with “userland”. I
guess the only part of this you really need to know for the setup here is that
it was a <a href="https://en.wikipedia.org/wiki/Severance_(TV_series)">Severance</a> joke,
which is some fantastic TV. If you haven’t seen it, this post will seem perhaps
weirder than it actually is. I promise I haven’t joined any new cults. For
those who have seen it, the payoff to my joke is that I wanted my machine to
boot directly to an image of
<a href="https://severance-tv.fandom.com/wiki/Kier_Eagan">Kier Eagan</a>.</p>
<p>As for how to do it – I figured I’d give the <a href="https://docs.rs/uefi/latest/uefi/">uefi
crate</a> a shot, and see how it is to use,
since this is a low stakes way of trying it out. In general, this isn’t the
sort of thing I’d usually post about – except this wound up being easier and
way cleaner than I thought it would be. That alone is worth sharing, in the
hopes someone comes across this in the future and feels like they, too, can
write something fun targeting the UEFI.</p>
<p>First things first – gotta create a rust project (I’ll leave that part to you
depending on your life choices), and to add the <code>uefi</code> crate to your
<code>Cargo.toml</code>. You can either use <code>cargo add</code> or add a line like this by hand:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-toml" data-lang="toml"><span style="display:flex;"><span><span style="color:#a6e22e">uefi</span> = { <span style="color:#a6e22e">version</span> = <span style="color:#e6db74">"0.33"</span>, <span style="color:#a6e22e">features</span> = [<span style="color:#e6db74">"panic_handler"</span>, <span style="color:#e6db74">"alloc"</span>, <span style="color:#e6db74">"global_allocator"</span>] }
</span></span></code></pre></div><p>We also need to teach cargo about how to go about building for the UEFI target,
so we need to create a <code>rust-toolchain.toml</code> with one (or both) of the UEFI
targets we’re interested in:</p>
<aside class="left">
I think there's a UEFI for riscv64 too, but I haven't found notes about it
in Rust-land.
</aside>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-toml" data-lang="toml"><span style="display:flex;"><span>[<span style="color:#a6e22e">toolchain</span>]
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">targets</span> = [<span style="color:#e6db74">"aarch64-unknown-uefi"</span>, <span style="color:#e6db74">"x86_64-unknown-uefi"</span>]
</span></span></code></pre></div><p>Unfortunately, I wasn’t able to use the
<a href="https://docs.rs/image/latest/image/">image</a> crate,
since it won’t build against the <code>uefi</code> target. This looks like it’s
because rustc had no way to compile the floating point operations the
<code>image</code> crate requires without hardware floating point instructions.
Rust tends to punt a lot of that to <code>libm</code>, so this isn’t entirely
shocking given we’re <code>no_std</code> on a non-hardfloat target.</p>
<aside class="right">
I didn't file any bugs or even track them down between the image crate
and rustc, since I figured this isn't actionable for anyone involved aside
from "implement soft floats in the compiler to backfill this target".
</aside>
<p>So-called “softening” requires a software floating point implementation that
the compiler can use to “polyfill” (feels weird to use the term polyfill here,
but I guess it’s spiritually right?) the lack of hardware floating point
operations, which rust hasn’t implemented for this target yet. As a result, I
changed tactics, and figured I’d use <code>ImageMagick</code> to pre-compute the pixels
from a <code>jpg</code>, rather than doing it at runtime. A bit of a bummer, since I need
to do more out of band pre-processing and hardcoding, and updating the image
kinda sucks as a result – but it’s entirely manageable.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sh" data-lang="sh"><span style="display:flex;"><span>$ convert -resize 1280x900 kier.jpg kier.full.jpg
</span></span><span style="display:flex;"><span>$ convert -depth <span style="color:#ae81ff">8</span> kier.full.jpg rgba:kier.bin
</span></span></code></pre></div><p>This will take our input file (<code>kier.jpg</code>), resize it to get as close to the
desired resolution as possible while maintaining aspect ratio, then convert it
from a <code>jpg</code> to a flat array of 4 byte <code>RGBA</code> pixels. Critically, it’s also
important to remember that the dimensions of the <code>kier.full.jpg</code> file may
not actually be the requested ones – the resize will not change the aspect
ratio – so be sure to make a careful note of the resulting dimensions of the
<code>kier.full.jpg</code> file.</p>
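<p>For the curious, the “fit inside the box, keep the aspect ratio” arithmetic that <code>-resize</code> performs can be sketched as below. This is a general sketch, not ImageMagick’s exact rounding rules, and the source dimensions in the example are hypothetical.</p>

```rust
/// Scale `src` to fit inside `max` while preserving aspect ratio,
/// roughly what `convert -resize WxH` does (modulo exact rounding).
fn fit_within(src: (u32, u32), max: (u32, u32)) -> (u32, u32) {
    let (sw, sh) = (src.0 as f64, src.1 as f64);
    // Use the smaller of the two scale factors so both dimensions fit.
    let scale = (max.0 as f64 / sw).min(max.1 as f64 / sh);
    ((sw * scale).round() as u32, (sh * scale).round() as u32)
}

fn main() {
    // A hypothetical 1600x900 source into a 1280x900 box scales by 0.8:
    // we land at 1280x720, not the full 1280x900 we asked for.
    assert_eq!(fit_within((1600, 900), (1280, 900)), (1280, 720));
}
```

<p>In practice, ImageMagick’s <code>identify kier.full.jpg</code> is the quick way to read off the real resulting dimensions.</p>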
<p>Last step with the image is to compile it into our Rust binary, since we
don’t want to struggle with trying to read this off disk, which is thankfully
real easy to do.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#66d9ef">KIER</span>: <span style="color:#66d9ef">&</span>[<span style="color:#66d9ef">u8</span>] <span style="color:#f92672">=</span> <span style="color:#a6e22e">include_bytes!</span>(<span style="color:#e6db74">"../kier.bin"</span>);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#66d9ef">KIER_WIDTH</span>: <span style="color:#66d9ef">usize</span> <span style="color:#f92672">=</span> <span style="color:#ae81ff">1280</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#66d9ef">KIER_HEIGHT</span>: <span style="color:#66d9ef">usize</span> <span style="color:#f92672">=</span> <span style="color:#ae81ff">641</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#66d9ef">KIER_PIXEL_SIZE</span>: <span style="color:#66d9ef">usize</span> <span style="color:#f92672">=</span> <span style="color:#ae81ff">4</span>;
</span></span></code></pre></div><p>Remember to use the width and height from the final <code>kier.full.jpg</code> file as the
values for <code>KIER_WIDTH</code> and <code>KIER_HEIGHT</code>. <code>KIER_PIXEL_SIZE</code> is 4, since we
have 4 byte wide values for each pixel as a result of our conversion step into
RGBA. We’ll only use RGB, and if we ever drop the alpha channel, we can drop
that down to 3. I don’t entirely know why I kept alpha around, but I figured it
was fine. My <code>kier.full.jpg</code> image winds up shorter than the requested height
(which is also qemu’s default resolution for me) – which means we’ll get a
semi-annoying black band under the image when we go to run it – but it’ll
work.</p>
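<p>Since <code>kier.bin</code> is a flat, row-major run of 4-byte pixels, pixel <code>(x, y)</code> starts at byte <code>((y * width) + x) * 4</code> – the same arithmetic the rendering loop later does with <code>KIER_WIDTH</code> and <code>KIER_PIXEL_SIZE</code>. As a tiny standalone sketch (the function name is ours, not from the post’s code):</p>

```rust
/// Byte offset of pixel (x, y) in a flat, row-major RGBA buffer
/// that is `width` pixels wide, at 4 bytes per pixel.
fn rgba_offset(x: usize, y: usize, width: usize) -> usize {
    ((y * width) + x) * 4
}

fn main() {
    // Pixel (0, 0) occupies the first 4 bytes; pixel (2, 1) in a
    // 1280-pixel-wide image starts at ((1 * 1280) + 2) * 4 = 5128.
    assert_eq!(rgba_offset(0, 0, 1280), 0);
    assert_eq!(rgba_offset(2, 1, 1280), 5128);
}
```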
<p>Anyway, now that we have our image as bytes, we can get down to work, and
write the rest of the code to handle moving bytes around from in-memory
as a flat block of pixels, and request that they be displayed using the
<a href="https://wiki.osdev.org/GOP">UEFI GOP</a>. We’ll just need to hack up a container
for the image pixels and teach it how to blit to the display.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#e6db74">/// RGB Image to move around. This isn't the same as an
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">/// `image::RgbImage`, but we can associate the size of
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">/// the image along with the flat buffer of pixels.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span><span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">RgbImage</span> {
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">/// Size of the image as a tuple, as the
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#e6db74">/// (width, height)
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> size: (<span style="color:#66d9ef">usize</span>, <span style="color:#66d9ef">usize</span>),
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">/// raw pixels we'll send to the display.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> inner: Vec<span style="color:#f92672"><</span>BltPixel<span style="color:#f92672">></span>,
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">impl</span> RgbImage {
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">/// Create a new `RgbImage`.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">new</span>(width: <span style="color:#66d9ef">usize</span>, height: <span style="color:#66d9ef">usize</span>) -> <span style="color:#a6e22e">Self</span> {
</span></span><span style="display:flex;"><span> RgbImage {
</span></span><span style="display:flex;"><span> size: (width, height),
</span></span><span style="display:flex;"><span> inner: <span style="color:#a6e22e">vec</span><span style="color:#f92672">!</span>[BltPixel::new(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">0</span>); width <span style="color:#f92672">*</span> height],
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">/// Take our pixels and request that the UEFI GOP
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#e6db74">/// display them for us.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">write</span>(<span style="color:#f92672">&</span>self, gop: <span style="color:#66d9ef">&</span><span style="color:#a6e22e">mut</span> GraphicsOutput) -> Result {
</span></span><span style="display:flex;"><span> gop.blt(BltOp::BufferToVideo {
</span></span><span style="display:flex;"><span> buffer: <span style="color:#66d9ef">&</span><span style="color:#a6e22e">self</span>.inner,
</span></span><span style="display:flex;"><span> src: <span style="color:#a6e22e">BltRegion</span>::Full,
</span></span><span style="display:flex;"><span> dest: (<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">0</span>),
</span></span><span style="display:flex;"><span> dims: <span style="color:#a6e22e">self</span>.size,
</span></span><span style="display:flex;"><span> })
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">impl</span> Index<span style="color:#f92672"><</span>(<span style="color:#66d9ef">usize</span>, <span style="color:#66d9ef">usize</span>)<span style="color:#f92672">></span> <span style="color:#66d9ef">for</span> RgbImage {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">type</span> <span style="color:#a6e22e">Output</span> <span style="color:#f92672">=</span> BltPixel;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">index</span>(<span style="color:#f92672">&</span>self, idx: (<span style="color:#66d9ef">usize</span>, <span style="color:#66d9ef">usize</span>)) -> <span style="color:#66d9ef">&</span><span style="color:#a6e22e">BltPixel</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> (x, y) <span style="color:#f92672">=</span> idx;
</span></span><span style="display:flex;"><span> <span style="color:#f92672">&</span>self.inner[y <span style="color:#f92672">*</span> self.size.<span style="color:#ae81ff">0</span> <span style="color:#f92672">+</span> x]
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">impl</span> IndexMut<span style="color:#f92672"><</span>(<span style="color:#66d9ef">usize</span>, <span style="color:#66d9ef">usize</span>)<span style="color:#f92672">></span> <span style="color:#66d9ef">for</span> RgbImage {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">index_mut</span>(<span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self, idx: (<span style="color:#66d9ef">usize</span>, <span style="color:#66d9ef">usize</span>)) -> <span style="color:#66d9ef">&</span><span style="color:#a6e22e">mut</span> BltPixel {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> (x, y) <span style="color:#f92672">=</span> idx;
</span></span><span style="display:flex;"><span> <span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self.inner[y <span style="color:#f92672">*</span> self.size.<span style="color:#ae81ff">0</span> <span style="color:#f92672">+</span> x]
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>We also need to do some basic setup to get a handle to the UEFI
GOP via the UEFI crate (using
<a href="https://docs.rs/uefi/latest/uefi/boot/fn.get_handle_for_protocol.html">uefi::boot::get_handle_for_protocol</a>
and
<a href="https://docs.rs/uefi/latest/uefi/boot/fn.open_protocol_exclusive.html">uefi::boot::open_protocol_exclusive</a>
for the <a href="https://docs.rs/uefi/latest/uefi/proto/console/gop/struct.GraphicsOutput.html">GraphicsOutput</a>
protocol), so that we have the object we need to pass to <code>RgbImage</code> in order
for it to write the pixels to the display. The only trick here is that the
display on the booted system can really be any resolution – so we need to do
some capping to ensure that we don’t write more pixels than the display can
handle. Writing fewer than the display’s maximum seems fine, though.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">praise</span>() -> Result {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> gop_handle <span style="color:#f92672">=</span> boot::get_handle_for_protocol::<span style="color:#f92672"><</span>GraphicsOutput<span style="color:#f92672">></span>()<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> <span style="color:#66d9ef">mut</span> gop <span style="color:#f92672">=</span> boot::open_protocol_exclusive::<span style="color:#f92672"><</span>GraphicsOutput<span style="color:#f92672">></span>(gop_handle)<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Get the (width, height) that is the minimum of
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// our image and the display we're using.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#66d9ef">let</span> (width, height) <span style="color:#f92672">=</span> gop.current_mode_info().resolution();
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> (width, height) <span style="color:#f92672">=</span> (width.min(<span style="color:#66d9ef">KIER_WIDTH</span>), height.min(<span style="color:#66d9ef">KIER_HEIGHT</span>));
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> <span style="color:#66d9ef">mut</span> buffer <span style="color:#f92672">=</span> RgbImage::new(width, height);
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> y <span style="color:#66d9ef">in</span> <span style="color:#ae81ff">0</span><span style="color:#f92672">..</span>height {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> x <span style="color:#66d9ef">in</span> <span style="color:#ae81ff">0</span><span style="color:#f92672">..</span>width {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> idx_r <span style="color:#f92672">=</span> ((y <span style="color:#f92672">*</span> <span style="color:#66d9ef">KIER_WIDTH</span>) <span style="color:#f92672">+</span> x) <span style="color:#f92672">*</span> <span style="color:#66d9ef">KIER_PIXEL_SIZE</span>;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> pixel <span style="color:#f92672">=</span> <span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> buffer[(x, y)];
</span></span><span style="display:flex;"><span> pixel.red <span style="color:#f92672">=</span> <span style="color:#66d9ef">KIER</span>[idx_r];
</span></span><span style="display:flex;"><span> pixel.green <span style="color:#f92672">=</span> <span style="color:#66d9ef">KIER</span>[idx_r <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>];
</span></span><span style="display:flex;"><span> pixel.blue <span style="color:#f92672">=</span> <span style="color:#66d9ef">KIER</span>[idx_r <span style="color:#f92672">+</span> <span style="color:#ae81ff">2</span>];
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> buffer.write(<span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> gop)<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span> Ok(())
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Not so bad! A bit tedious – we could solve some of this by turning
<code>KIER</code> into an <code>RgbImage</code> at compile-time using some clever <code>Cow</code> and
<code>const</code> tricks and implement blitting a sub-image of the image – but this
will do for now. This is a joke, after all, let’s not go nuts. All that’s
left with our code is for us to write our <code>main</code> function and try and boot
the thing!</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#75715e">#[entry]</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">main</span>() -> <span style="color:#a6e22e">Status</span> {
</span></span><span style="display:flex;"><span> uefi::helpers::init().unwrap();
</span></span><span style="display:flex;"><span> praise().unwrap();
</span></span><span style="display:flex;"><span> boot::stall(<span style="color:#ae81ff">100_000_000</span>);
</span></span><span style="display:flex;"><span> Status::<span style="color:#66d9ef">SUCCESS</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>If you’re following along at home and so interested, the final source is over at
<a href="https://gist.github.com/paultag/60334e9f6c06388cc4b1c2cf12d85085">gist.github.com</a>.
We can go ahead and build it using <code>cargo</code> (as is our tradition) by targeting
the UEFI platform.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sh" data-lang="sh"><span style="display:flex;"><span>$ cargo build --release --target x86_64-unknown-uefi
</span></span></code></pre></div><h1 id="testing-the-uefi-blob">Testing the UEFI Blob</h1>
<p>While I can definitely get my machine to boot these blobs to test, I figured
I’d save myself some time by using QEMU to test without a full boot.
If you’ve not done this sort of thing before, we’ll need two packages,
<code>qemu</code> and <code>ovmf</code>. It’s a bit different than most invocations of qemu you
may see out there – so I figured it’d be worth writing this down, too.</p>
<aside class="left">
It's perhaps likely that you aren't using <code>doas</code> with Debian.
Replace <code>doas</code> with <code>sudo</code> if that's your thing.
</aside>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sh" data-lang="sh"><span style="display:flex;"><span>$ doas apt install qemu-system-x86 ovmf
</span></span></code></pre></div><p><code>qemu</code> has a nice feature where it’ll create us an EFI partition as a drive and
attach it to the VM off a local directory – so let’s construct an EFI
partition file structure, and drop our binary into the conventional location.
If you haven’t done this before, and are only interested in running this in a
VM, don’t worry too much about it, a lot of it is convention and this layout
should work for you.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sh" data-lang="sh"><span style="display:flex;"><span>$ mkdir -p esp/efi/boot
</span></span><span style="display:flex;"><span>$ cp target/x86_64-unknown-uefi/release/*.efi <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span> esp/efi/boot/bootx64.efi
</span></span></code></pre></div><p>With all this in place, we can kick off <code>qemu</code>, booting it in UEFI mode using
the <code>ovmf</code> firmware, attaching our EFI partition directory as a drive to
our VM to boot off of.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sh" data-lang="sh"><span style="display:flex;"><span>$ qemu-system-x86_64 <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span> -enable-kvm <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span> -m <span style="color:#ae81ff">2048</span> <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span> -smbios type<span style="color:#f92672">=</span>0,uefi<span style="color:#f92672">=</span>on <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span> -bios /usr/share/ovmf/OVMF.fd <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span> -drive format<span style="color:#f92672">=</span>raw,file<span style="color:#f92672">=</span>fat:rw:esp
</span></span></code></pre></div><p>If all goes well, soon you’ll be met with the all-knowing gaze of
Chosen One, Kier Eagan. The thing that really impressed me about all
this is that the program worked on the first try – it all went so boringly
normal. Truly, kudos to the <code>uefi</code> crate maintainers, it’s incredibly
well done.</p>
<div>
<img src="https://notes.pault.ag/boot2kier/boot2kier.png" />
</div>
<h1 id="booting-a-live-system">Booting a live system</h1>
<p>Sure, we <em>could</em> stop here, but anyone can open up an app window and see a
picture of Kier Eagan, so I knew I needed to finish the job and boot a real
machine up with this. In order to do that, we need to format a USB stick.
<strong>BE SURE /dev/sda IS CORRECT IF YOU’RE COPY AND PASTING</strong>. All my drives
are NVMe, so <strong>BE CAREFUL</strong> – if you use SATA, it may very well be your
hard drive! Please do not destroy your computer over this.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-txt" data-lang="txt"><span style="display:flex;"><span>$ doas fdisk /dev/sda
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>Welcome to fdisk (util-linux 2.40.4).
</span></span><span style="display:flex;"><span>Changes will remain in memory only, until you decide to write them.
</span></span><span style="display:flex;"><span>Be careful before using the write command.
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>Command (m for help): n
</span></span><span style="display:flex;"><span>Partition type
</span></span><span style="display:flex;"><span> p primary (0 primary, 0 extended, 4 free)
</span></span><span style="display:flex;"><span> e extended (container for logical partitions)
</span></span><span style="display:flex;"><span>Select (default p): p
</span></span><span style="display:flex;"><span>Partition number (1-4, default 1):
</span></span><span style="display:flex;"><span>First sector (2048-4014079, default 2048):
</span></span><span style="display:flex;"><span>Last sector, +/-sectors or +/-size{K,M,G,T,P} (2048-4014079, default 4014079):
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>Created a new partition 1 of type 'Linux' and of size 1.9 GiB.
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>Command (m for help): t
</span></span><span style="display:flex;"><span>Selected partition 1
</span></span><span style="display:flex;"><span>Hex code or alias (type L to list all): ef
</span></span><span style="display:flex;"><span>Changed type of partition 'Linux' to 'EFI (FAT-12/16/32)'.
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>Command (m for help): w
</span></span><span style="display:flex;"><span>The partition table has been altered.
</span></span><span style="display:flex;"><span>Calling ioctl() to re-read partition table.
</span></span><span style="display:flex;"><span>Syncing disks.
</span></span></code></pre></div><p>Once that looks good (depending on your flavor of <code>udev</code> you may or
may not need to unplug and replug your USB stick), we can go ahead
and format our new EFI partition (<strong>BE CAREFUL THAT /dev/sda IS YOUR
USB STICK</strong>) and write our EFI directory to it.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-txt" data-lang="txt"><span style="display:flex;"><span>$ doas mkfs.fat /dev/sda1
</span></span><span style="display:flex;"><span>$ doas mount /dev/sda1 /mnt
</span></span><span style="display:flex;"><span>$ cp -r esp/efi /mnt
</span></span><span style="display:flex;"><span>$ find /mnt
</span></span><span style="display:flex;"><span>/mnt
</span></span><span style="display:flex;"><span>/mnt/efi
</span></span><span style="display:flex;"><span>/mnt/efi/boot
</span></span><span style="display:flex;"><span>/mnt/efi/boot/bootx64.efi
</span></span></code></pre></div><p>Of course, naturally, devotion to Kier shouldn’t mean backdooring your system.
Disabling Secure Boot runs counter to the Core Principles, such as Probity, and
skipping this step would surely run counter to Verve, Wit and Vision. This bit does
require that you’ve taken the step to enroll a
<a href="https://wiki.debian.org/SecureBoot#MOK_-_Machine_Owner_Key">MOK</a> and know how
to use it. Right about now is when we can use <code>sbsign</code> to sign the UEFI binary
we want to boot from, so we can continue enforcing Secure Boot. The details of how
this command should be run specifically are likely something you’ll need to work
out depending on how you’ve decided to manage your MOK.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sh" data-lang="sh"><span style="display:flex;"><span>$ doas sbsign <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span> --cert /path/to/mok.crt <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span> --key /path/to/mok.key <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span> target/x86_64-unknown-uefi/release/*.efi <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span> --output esp/efi/boot/bootx64.efi
</span></span></code></pre></div><p>I figured I’d leave a signed copy of <code>boot2kier</code> at
<code>/boot/efi/EFI/BOOT/KIER.efi</code> on my Dell XPS 13, with Secure Boot enabled
and enforcing. It just took a matter of going into my BIOS to add the right
boot option, which was no sweat. I’m sure there is a way to do it using
<code>efibootmgr</code>, but I wasn’t smart enough to do that quickly. I let ’er rip,
and it booted up and worked great!</p>
<p>It was a bit hard to get a video of my laptop, though – but lucky for me, I
have a Minisforum Z83-F sitting around (which, until a few weeks ago was running
the annual http server to control my <a href="https://k3xec.com/christmas/">christmas tree</a>
) – so I grabbed it out of the christmas bin, wired it up to a video capture
card I have sitting around, and figured I’d grab a video of me booting a
physical device off the boot2kier USB stick.</p>
<div>
<img class="note-pad" src="https://notes.pault.ag/boot2kier/z83-boot2kier.gif" />
</div>
<p>Attentive readers will notice the image of Kier is smaller than the qemu-booted
system – which just means our real machine has a larger GOP display
resolution than qemu, which makes sense! We could write some fancy resize code
(sounds annoying), center the image (can’t be assed but should be the easy way
out here) or resize the original image (pretty hardware specific workaround).
Additionally, you can make out the image being written to the display before us
(the Minisforum logo) behind Kier, which is really cool stuff. If we were real
fancy we could write blank pixels to the display before blitting Kier, but,
again, I don’t think I care to do <em>that</em> much work.</p>
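<p>For what it’s worth, the “easy way out” really is just arithmetic on the two
resolutions. A minimal sketch of the centering math (the function name and the
saturating clamp are my own choices here, not anything from the post’s code):</p>

```rust
/// Compute the top-left coordinate at which to blit an image of
/// `(img_w, img_h)` pixels so it lands centered on a display of
/// `(disp_w, disp_h)` pixels. Saturates to (0, 0) rather than going
/// negative when the image is larger than the display.
fn center_offset(disp_w: usize, disp_h: usize, img_w: usize, img_h: usize) -> (usize, usize) {
    (
        disp_w.saturating_sub(img_w) / 2,
        disp_h.saturating_sub(img_h) / 2,
    )
}

fn main() {
    // An 800x600 display with a 640x480 image: blit at (80, 60).
    assert_eq!(center_offset(800, 600, 640, 480), (80, 60));
    // Image bigger than the display: clamp to the origin.
    assert_eq!(center_offset(640, 480, 800, 600), (0, 0));
}
```

<p>You’d feed the offsets into the same GOP blit call used for the top-left blit;
only the destination coordinates change.</p>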
<h1 id="but-now-i-must-away">But now I must away</h1>
<p>If I wanted to keep this joke going, I’d likely try and find a copy of the
original
<a href="https://www.youtube.com/watch?v=U6EUG22elbs">video when Helly 100%s her file</a>
and boot into that – or maybe play a terrible midi PC speaker rendition of
<a href="https://www.youtube.com/watch?v=OsbxAsdR0QI">Kier, Chosen One, Kier</a> after
rendering the image. I, unfortunately, don’t have any friends involved with
production (yet?), so I reckon all that’s out for now. I’ll likely stop playing
with this – the joke was done and I’m only writing this post because of how
great everything was along the way.</p>
<p>All in all, this reminds me so much of building a homebrew kernel to boot a
system into – but like, <em>good</em>, though, and it’s a nice reminder of both how
fun this stuff can be, and how far we’ve come. UEFI protocols are light-years
better than how we did it in the dark ages, and the tooling for this is <em>SO</em>
much more mature. Booting a custom UEFI binary is <em>miles</em> ahead of trying to
boot your own kernel, and I can’t believe how good the <code>uefi</code> crate is
specifically.</p>
<p>Praise Kier! Kudos, to everyone involved in making this so delightful ❤️.</p> Complex for Whom? https://notes.pault.ag/complex-for-whom/Tue, 12 Nov 2024 15:21:00 -0500 https://notes.pault.ag/complex-for-whom/ <p>In basically every engineering organization I’ve ever regarded as particularly
high functioning, I’ve sat through one specific recurring conversation –
a conversation about “complexity”. Things are good or bad because they
are or aren’t complex, architectures need to be redone because it’s too
complex – some refactor of whatever it is won’t work because it’s too complex.
You may have even been a part of some of these conversations – or even been
the one advocating for simple light-weight solutions. I’ve done it. Many times.</p>
<aside class="right">
When I was writing this, I had a flashback to an over-10-year-old
post by <code>mjg59</code> about
<a href="https://mjg59.dreamwidth.org/2414.html">LightDM</a>.
It would be a mistake not to link it here.
</aside>
<p>Rarely, if ever, do we talk about complexity within its rightful context –
complexity <strong>for whom</strong>. <u>Is a solution complex because it’s complex for the
end user? Is it complex if it’s complex for an API consumer? Is it complex if
it’s complex for the person maintaining the API service? Is it complex if it’s
complex for someone outside the team maintaining it to understand?</u>
Complexity within a problem domain, I’ve come to believe, is fairly zero-sum –
there’s a fixed amount of complexity in the problem to be solved, and you can
choose to either solve it, or leave it for those downstream of you
to solve on their own.</p>
<aside class="left">
Although I believe there is a fixed amount of complexity in the lower bound
of a problem, you always have the option to change the problem you're
solving!
</aside>
<p>That being said, while I believe there is a <em>lower</em> bound in complexity to
contend with for a problem, I do not believe there is an <em>upper</em> bound to the
complexity of solutions possible. It is always possible, and in fact, very
likely that teams create problems for themselves while trying to solve a
problem. The rest of this post speaks to that lower bound. When getting
feedback on an early draft of this blog post, I was informed that Fred
Brooks coined a term for what I call “lower bound complexity” – “Essential
Complexity”, in the paper
“<a href="https://www.cs.unc.edu/techreports/86-020.pdf">No Silver Bullet—Essence and Accident in Software Engineering</a>”,
which is a better term and can be used interchangeably.</p>
<h1 id="complexity-culture">Complexity Culture</h1>
<p>In a large enough organization, where the team is high functioning enough to
have and maintain trust amongst peers, members of the team will specialize.
People will begin to engage with subsets of the work to be done, and begin to
have their efficacy measured against that part of the organization’s problems.
Incentives shift, and over time it becomes increasingly likely that two
engineers may have two very different priorities when working on the same
system together. Someone accountable for uptime and tasked with responding to
outages will begin to resist changes. Someone accountable for rapidly
delivering features will resist gates between them and their users. Companies
(either wittingly or unwittingly) will deal with this by tasking engineers with
both production (feature development) and operational tasks (maintenance), so
the difference in incentives isn’t usually as bad as it <em>could</em> be.</p>
<aside class="left">
The events depicted in this movie are fictitious. Any similarity to any
person living or dead is merely coincidental.
</aside>
<p>When we get a bunch of folks from far-flung corners of an organization in a
room, fire up a slide deck and throw up some aspirational to-be architecture
diagram in order to get a sign-off to solve some problem (be it someone needs a
credible promotion packet, new feature needs to get delivered, or the system
has begun to fail and needs fixing), the initial reaction will, more often than
I’d like, start to devolve into a discussion of how this is going to introduce
a bunch of complexity, going to be hard to maintain, why can’t you make it
<em>less complex</em>?</p>
<aside class="right">
In a high functioning environment, this is a mostly healthy impulse,
coming from a good place, and genuinely intended to prevent problems
for the whole organization by reducing non-essential complexity.
That is good. What I'm talking about is when the conversation turns to
removing lower-limit complexity.
</aside>
<p>Right around here is when I start to try and contextualize the conversation
happening around me – understand what complexity it is that’s being discussed, and
understand who is taking on that burden. Think about who <em>should</em> be owning
that problem, and work through the tradeoffs involved. Is it best solved here,
or left to consumers (be they other systems, developers, or users)? Should
something become an API call’s optional param, taking on all the edge-cases and
so on, or should users have to implement the logic using the data you
return (leaving everyone else to take on all the edge-cases and maintenance)?
Should you process the data, or require the user to preprocess it for you?</p>
<aside class="left">
<a href="https://layeraleph.com/">Carla Geisser</a> described this as being
reminiscent of the technique outlined in
"<a href="https://web.mit.edu/saltzer/www/publications/endtoend/endtoend.pdf">end to end arguments in system design</a>",
which she uses to think about where complexity winds up in a system. It's
an extremely good parallel.
</aside>
<p>Frequently it’s right to make an active and explicit decision to simplify and
leave problems to be solved downstream, since they may not actually need to be
solved – or perhaps you expect consumers will want to own the specifics of
<em>how</em> the problem is solved, in which case you leave lots of documentation and
examples. Many other times, especially when it’s something downstream consumers
are likely to hit, it’s best solved internal to the system, since the only
thing that can come of leaving it unsolved are bugs, frustration and
half-correct solutions. This is a grey-space of tradeoffs, not a clear decision
tree. No one wants the software manifestation of a katamari ball or a junk
drawer, nor does anyone want a half-baked service unable to handle the simplest
use-case.</p>
<h1 id="head-in-sand-as-a-service">Head-in-sand as a Service</h1>
<p>Popoffs about how complex something is, are, to a first approximation, best
understood as meaning “complicated for the person making comments”. A lot of
the <code>#thoughtleadership</code> believe that an AWS hosted EKS <code>k8s</code> cluster running
images built by CI talking to an AWS hosted PostgreSQL RDS is not complex.
They’re right. Mostly right. This is less complex – less complex <em>for them</em>.
It’s not, however, without complexity and its own tradeoffs – it’s just
complexity that <strong>they do not have to deal with</strong>. Now they don’t have to
maintain machines that have pesky operating systems or hard drive failures.
They don’t have to deal with updating the version of <code>k8s</code>, nor ensuring the
backups work. No one has to push some artifact to prod manually. Deployments
happen unattended. You click a button and get a cluster.</p>
<p>On the other hand, developers outside the ops function need to deal with
troubleshooting CI, debugging access control rules encoded in turing complete
YAML, permissions issues inside the cluster due to whatever the fuck a service
mesh is, everyone needs to learn how to use some <code>k8s</code> tools they only actually
use during a bad day, likely while doing some <code>x.509</code> troubleshooting to
connect to the cluster (an internal only endpoint; just port forward it) – not
to mention all sorts of rules to route packets to their project (a single
repo’s binary being run in 3 containers on a single vm host).</p>
<aside class="right">
Truly I'm not picking on k8s here; I do genuinely believe it when I say
EKS is less complex for me to operate well; that's kinda the whole point.
</aside>
<p>Beyond that, there’s the invisible complexity – complexity on the interior of
a service you depend on. I think about the dozens of teams maintaining the EKS
service (which is either run on EC2 instances, or alternately, EC2 instances in
a trench coat, moustache and even more shell scripts), the RDS service (also
EC2 and shell scripts, but this time accounting for redundancy, backups,
availability zones), scores of hypervisors pulled off the shelf (<code>xen</code>, <code>kvm</code>)
smashed together with the ones built in-house (<code>firecracker</code>, <code>nitro</code>, etc)
running on hardware that has to be refreshed and maintained continuously. Every
request processed by network ACL rules, AWS IAM rules, security group rules,
using IP space announced to the internet wired through IXPs directly into ISPs.
I don’t even want to begin to think about the complexity inherent in how those
switches are designed. <em>Shitloads</em> of complexity to solve problems you may or
may not have, or even know you had.</p>
<aside class="left">
Do I care about invisible complexity? Generally, no. I don't. It's not my
problem and they don't show up to my meetings.
</aside>
<p><strong>What’s more complex? An app running in an in-house 4u server racked in the
office’s telco closet in the back running off the office Verizon line, or an
app running four hypervisors deep in an AWS datacenter? Which is more complex
<em>to you</em>? What about <em>to your organization</em>? <em>In total</em>? Which is more prone to
failure? Which is more secure? Is the complexity good or bad? What type of
complexity can you manage effectively? Which threatens the system? Which
threatens your users?</strong></p>
<h1 id="complexivibes">COMPLEXIVIBES</h1>
<p>This extends beyond Engineering. Decisions regarding “what tools are we able to
use” – be them existing contracts with cloud providers, CIO mandated SaaS
products, a list of the only permissible open source projects – will incur
costs in terms of expressed “complexity”. Pinning open source projects to a
fixed set makes SBOM production “less complex”. Using only one SaaS provider’s
product suite (even if it’s terrible, because it has all the types of tools you
need) makes accreditation “less complex”. If all you have is a contract with
<em>Pauly T’s lowest price technically acceptable artisanal cloudary and
haberdashery</em>, the way you pay for your compute is “less complex” for the CIO
shop, though you will find yourself building your own hosted database template,
mechanism to spin up a k8s cluster, and all the operational and technical
burden that comes with it. Or you won’t and make it everyone else’s problem in
the organization. Nothing you can do will solve for the fact that you <em>must</em>
now deal with this problem <em>somewhere</em> because it was less complicated for the
business to put the workloads on the existing contract with a cut-rate vendor.</p>
<p>Suddenly, the decision to “reduce complexity” because of an existing contract
vehicle has resulted in a huge amount of technical risk and maintenance burden
being onboarded. Complexity you would otherwise externalize has now been taken
on internally. With large enough organizations (specifically, in this case,
I’m talking about you, bureaucracies), this is largely ignored or accepted as
normal since the personnel cost is understood to be free to everyone involved.
Doing it this way is more expensive, more work, less reliable and less
maintainable, and yet, somehow, is, in a lot of ways, “less complex” to the
organization. It’s particularly bad with bureaucracies, since screwing up a
contract will get you into much more trouble than delivering a broken product,
leaving basically no reason for anyone to care to fix this.</p>
<p>I can’t shake the feeling that for every story of <a href="https://mjw.wtf/weaver-a-tale-of-technical-policy.html">technical mandates gone
awry</a>, somewhere just
out of sight there’s a decisionmaker optimizing for what they believe to be the
least amount of complexity – least hassle, fewest unique cases, most
consistency – as they can. They freely offload complexity from their
accreditation and risk acceptance functions through mandates. They will never
have to deal with it. That does not change the fact that <em>someone does</em>.</p>
<h1 id="tcdr-too-complex-didnt-review">TC;DR (TOO COMPLEX; DIDN’T REVIEW)</h1>
<p>We wish to rid ourselves of systemic Complexity – after all, complexity is
bad, simplicity is good. Removing upper-bound own-goal complexity (“accidental
complexity” in Brooks’s terms) is important, but once you hit the lower bound
complexity, the tradeoffs become zero-sum. Removing complexity from one part of
the system means that somewhere else - maybe outside your organization or in a
non-engineering function - must grow it back. Sometimes, the opposite is the
case, such as when a previously manual business process is automated. Maybe that’s a
good idea. Maybe it’s not. All I know is that what doesn’t help the situation
is conflating complexity with everything we don’t like – legacy code,
maintenance burden or toil, cost, delivery velocity.</p>
<ul>
<li><strong>Complexity is not the same as proclivity to failure.</strong> The most reliable
systems I’ve interacted with are unimaginably complex, with layers of internal
protection to prevent complete failure. This has its own set of costs which
other people <a href="https://how.complexsystems.fail/">have written about extensively</a>.</li>
<li><strong>Complexity is not cost.</strong> Sometimes the cost of taking all the complexity
in-house is less, for whatever value of cost you choose to use.</li>
<li><strong>Complexity is not absolute.</strong> Something simple from one perspective may
be wildly complex from another. The impulse to burn down complex sections of
code is helpful to have generally, but
<a href="https://en.wiktionary.org/wiki/Chesterton%27s_fence">sometimes things are complicated for a reason</a>,
even if that reason exists outside your codebase or organization.</li>
<li><strong>Complexity is not something you can remove without introducing complexity
elsewhere.</strong> Just as not making a decision is a decision itself; choosing to
require someone else to deal with a problem rather than dealing with it
internally is a choice that needs to be considered in its full context.</li>
</ul>
<aside class="left">
After reviewing an early draft of this post,
<a href="https://layeraleph.com/">Mikey Dickerson</a> described what
I was trying to say here back to me as "if you squeeze one part of
the water balloon it goes somewhere else", which is a metaphor I've
become attached to.
</aside>
<aside class="right">
Mikey also described these asides as being a Dr. Bronner's label,
which I'll own.
</aside>
<p>Next time you’re sitting through a discussion and someone starts to talk about
all the complexity about to be introduced, I want to pop up in the back of your
head, politely asking <em>what does complex mean in this context</em>? Is it lower
bound complexity? Is this complexity desirable? Does what they’re saying mean
something along the lines of I don’t understand the problems being solved, or
does it mean something along the lines of this problem <em>should</em> be solved
elsewhere? Do they believe this will result in more work for them in a way that
you don’t see? Should this not be solved at all, by changing the bounds of what
we accept or redefining the understood limits of this system? Is the perceived
complexity a result of a decision elsewhere? Who’s taking this complexity on,
or more to the point, is failing to address complexity required by the problem
leaving it to others? Does it impact others? How specifically? What are you
not seeing?</p>
<p>What <em>can</em> change?</p>
<p><em>What should change</em>?</p> Domo Arigato, Mr. debugfs https://notes.pault.ag/debugfs/Sat, 13 Apr 2024 09:27:00 -0400 https://notes.pault.ag/debugfs/ <p>Years ago, at what I think I remember was DebConf 15, I hacked for a while
on debhelper to
<a href="https://github.com/Debian/debhelper/commit/5549f841fd7cba07e21df8e4f70b21c31cfb3da6">write build-ids to debian binary control files</a>,
so that the <code>build-id</code> (more specifically, the ELF note
<code>.note.gnu.build-id</code>) wound up in the Debian apt archive metadata.
I’ve always thought this was super cool, and seeing as how Michael Stapelberg
<a href="https://michael.stapelberg.ch/posts/2019-02-15-debian-debugging-devex/">blogged</a>
some great pointers around the ecosystem, including the fancy new <code>debuginfod</code>
service, and the
<a href="https://manpages.debian.org/testing/debian-goodies/find-dbgsym-packages.1.en.html">find-dbgsym-packages</a>
helper, which uses these same headers, I don’t think I’m the only one.</p>
<p>At work I’ve been using a lot of <a href="https://www.rust-lang.org/">rust</a>,
specifically, async rust using <a href="https://tokio.rs/">tokio</a>. To try and work on
my style, and to dig deeper into the how and why of the decisions made in these
frameworks, I’ve decided to hack up a project that I’ve wanted to do ever
since 2015 – write a debug filesystem. Let’s get to it.</p>
<h1 id="back-to-the-future">Back to the Future</h1>
<aside class="left">
It shouldn't shock anyone to learn I'm a huge fan of Go, right?
</aside>
<p>Time to admit something. I really love <a href="https://9front.org/">Plan 9</a>. It’s
just so good. So many ideas from Plan 9 are just so prescient, and everything
just feels <em>right</em>. Not just right like, feels good – like, <em>correct</em>. The
bit that I’ve always liked the most is <code>9p</code>, the network protocol for serving
a filesystem over a network. This leads to all sorts of fun programs, like the
Plan 9 <code>ftp</code> client being a 9p server – you mount the ftp server and access
files like any other files. It’s kinda like if fuse were more fully a part
of how the operating system worked, but fuse is all running client-side. With
9p there’s a single client, and different <em>servers</em> that you can connect to,
which may be backed by a hard drive, by remote resources over something like SFTP, FTP or HTTP, or which may even be purely synthetic.
<aside class="right">
I even triggered a weird bug in
<a href="https://github.com/vim/vim/commit/14759ded57447345ba11c11a99fd84344797862c">vim</a>
when writing a 9p filesystem that wound up impacting
<a href="https://github.com/microsoft/WSL/issues/11256">WSL</a>
-- although it seems like maybe not due to 9p (rather, SMB)
</aside>
<p>The interesting (maybe sad?) part here is that 9p wound up outliving Plan 9
in terms of adoption – <code>9p</code> is in all sorts of places folks don’t usually expect.
For instance, the Windows Subsystem for Linux uses the 9p protocol to share
files between Windows and Linux. ChromeOS uses it to share files with Crostini,
and qemu uses 9p (<code>virtio-p9</code>) to share files between guest and host. If you’re
noticing a pattern here, you’d be right; for some reason 9p is the go-to protocol
to exchange files between hypervisor and guest. Why? I have no idea, except maybe
that it’s well designed and simple to implement, which makes it a lot easier to
validate the data being shared and enforce security boundaries. Simplicity has its value.</p>
<p>As a result, there’s a <em>lot</em> of lingering 9p support kicking around. Turns out
Linux can even handle mounting 9p filesystems out of the box. This means that I
can deploy a filesystem to my LAN or my <code>localhost</code> by running a process on top
of a computer that needs nothing special, and mount it over the network on an
unmodified machine – unlike <code>fuse</code>, where you’d need client-specific software
to run in order to mount the directory. For instance, let’s mount a 9p
filesystem running on my localhost machine, serving requests on <code>127.0.0.1:564</code>
(tcp) that goes by the name “<code>mountpointname</code>” to <code>/mnt</code>.</p>
<aside class="left">
Unfortunately, this requires root to mount and feels very un-plan9,
but it does work and the protocol is good.
</aside>
<pre>
$ mount -t 9p \
-o trans=tcp,port=564,version=9p2000.u,aname=mountpointname \
127.0.0.1 \
/mnt
</pre>
<p>Linux will mount away, and attach to the filesystem as the root user, and by default,
attach to that mountpoint again for each local user that attempts to use
it. Nifty, right? I think so. The server is able
to keep track of per-user access and authorization
in step with the host OS.</p>
<h1 id="wherein-i-styx-with-it">WHEREIN I STYX WITH IT</h1>
<aside class="right">
"Simple" here is intended as my highest form of praise. Writing complex
things is easy. Taking your work and simplifying it down to the core
is the most difficult part of our work.
</aside>
<p>Since I wanted to push myself a bit more with <code>rust</code> and <code>tokio</code> specifically,
I opted to implement the whole stack myself, without third party libraries on
the critical path where I could avoid it. The 9p protocol (sometimes called
<code>Styx</code>, the original name for it) is incredibly simple. It’s a series of client
to server requests, which receive a server to client response. These are,
respectively, “<code>T</code>” messages, which <code>t</code>ransmit a request to the server, which
trigger an “<code>R</code>” message in response (<code>R</code>eply messages). These messages are
<a href="https://en.wikipedia.org/wiki/Type%E2%80%93length%E2%80%93value">TLV</a> payloads
with a very straightforward structure – so straightforward, in fact, that I
was able to implement a working server off nothing more than a handful of <a href="https://9fans.github.io/plan9port/man/man9/">man
pages</a>.</p>
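<p>To make that concrete, here’s a sketch of how a <code>T</code> message is laid out on the wire. This is hand-rolled for illustration, not <code>arigato</code>’s API: a little-endian <code>u32</code> size that counts itself, a one-byte message type, a two-byte tag, then the body, with strings carried as a <code>u16</code> length followed by the bytes.</p>

```rust
// Illustrative framing of a 9p Tversion message (not arigato's API).
// 9p messages are length-prefixed: size[4] type[1] tag[2] body, with all
// integers little-endian; strings are a u16 length followed by raw bytes.
fn tversion(msize: u32, version: &str) -> Vec<u8> {
    const TVERSION: u8 = 100; // message type for version negotiation
    const NOTAG: u16 = 0xFFFF; // Tversion is sent with the special NOTAG tag

    let mut body = Vec::new();
    body.extend_from_slice(&msize.to_le_bytes());
    body.extend_from_slice(&(version.len() as u16).to_le_bytes());
    body.extend_from_slice(version.as_bytes());

    // The size field counts the whole message, including itself.
    let size = (4 + 1 + 2 + body.len()) as u32;
    let mut msg = Vec::new();
    msg.extend_from_slice(&size.to_le_bytes());
    msg.push(TVERSION);
    msg.extend_from_slice(&NOTAG.to_le_bytes());
    msg.extend_from_slice(&body);
    msg
}

fn main() {
    let msg = tversion(8192, "9P2000.u");
    // 4 (size) + 1 (type) + 2 (tag) + 4 (msize) + 2 + 8 ("9P2000.u") = 21
    assert_eq!(msg.len(), 21);
    assert_eq!(&msg[0..4], &21u32.to_le_bytes());
    println!("{:02x?}", msg);
}
```

<p>The matching <code>Rversion</code> reply comes back in the same shape, which is what makes the protocol so pleasant to implement from man pages alone.</p>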
<aside class="left">
There's also a <code>9P2000.L</code> 9p variant which has more
Linux-specific extensions. There's a good chance I'll port this
forward when I get the chance.
</aside>
<p>Later on, after the basics worked, I found a more complete
<a href="https://ericvh.github.io/9p-rfc/rfc9p2000.html">spec page</a>
that contains more information about the
<a href="https://ericvh.github.io/9p-rfc/rfc9p2000.u.html">unix-specific variant</a>
that I opted to use (<code>9P2000.u</code> rather than <code>9P2000</code>), due to
Linux supporting <code>9P2000.u</code> better than plain <code>9P2000</code>.</p>
<h1 id="mr-roboto">MR ROBOTO</h1>
<aside class="right">
It really bothers me that Rust libraries that deal with I/O need to support
std::io, but that to add support for async runtimes, you need to implement
support for tokio::io and every other runtime; but them's the breaks, I
guess. I really miss Go's built-in async support and io module.
</aside>
<p>The backend stack over at <a href="https://zoo.dev">zoo</a> is <code>rust</code> and <code>tokio</code>
running i/o for an <code>HTTP</code> and <code>WebRTC</code> server. I figured I’d pick something
fairly similar to write my filesystem with, since <code>9P</code> can be implemented
on basically anything with I/O. That means <code>tokio</code> tcp server bits, which
construct and use a <code>9p</code> server, which has an idiomatic Rusty API that
partially abstracts the raw <code>R</code> and <code>T</code> messages, but not so much
that it hides implementation possibilities. At each abstraction
level, there’s an escape hatch – allowing someone to implement any of
the layers if required. I called this framework
<a href="https://github.com/paultag/arigato">arigato</a> which can be found over on
<a href="https://docs.rs/arigato">docs.rs</a> and
<a href="https://crates.io/crates/arigato">crates.io</a>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#e6db74">/// Simplified version of the arigato File trait; this isn't actually
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">/// the same trait; there's some small cosmetic differences. The
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">/// actual trait can be found at:
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">///
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">/// https://docs.rs/arigato/latest/arigato/server/trait.File.html
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span><span style="color:#66d9ef">trait</span> File {
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">/// OpenFile is the type returned by this File via an Open call.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#66d9ef">type</span> <span style="color:#a6e22e">OpenFile</span>: <span style="color:#a6e22e">OpenFile</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">/// Return the 9p Qid for this file. A file is the same if the Qid is
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#e6db74">/// the same. A Qid contains information about the mode of the file,
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#e6db74">/// version of the file, and a unique 64 bit identifier.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">qid</span>(<span style="color:#f92672">&</span>self) -> <span style="color:#a6e22e">Qid</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">/// Construct the 9p Stat struct with metadata about a file.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">stat</span>(<span style="color:#f92672">&</span>self) -> <span style="color:#a6e22e">FileResult</span><span style="color:#f92672"><</span>Stat<span style="color:#f92672">></span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">/// Attempt to update the file metadata.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">wstat</span>(<span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self, s: <span style="color:#66d9ef">&</span><span style="color:#a6e22e">Stat</span>) -> <span style="color:#a6e22e">FileResult</span><span style="color:#f92672"><</span>()<span style="color:#f92672">></span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">/// Traverse the filesystem tree.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">walk</span>(<span style="color:#f92672">&</span>self, path: <span style="color:#66d9ef">&</span>[<span style="color:#f92672">&</span><span style="color:#66d9ef">str</span>]) -> <span style="color:#a6e22e">FileResult</span><span style="color:#f92672"><</span>(Option<span style="color:#f92672"><</span>Self<span style="color:#f92672">></span>, Vec<span style="color:#f92672"><</span>Self<span style="color:#f92672">></span>)<span style="color:#f92672">></span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">/// Request that a file's reference be removed from the file tree.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">unlink</span>(<span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self) -> <span style="color:#a6e22e">FileResult</span><span style="color:#f92672"><</span>()<span style="color:#f92672">></span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">/// Create a file at a specific location in the file tree.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">create</span>(
</span></span><span style="display:flex;"><span> <span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self,
</span></span><span style="display:flex;"><span> name: <span style="color:#66d9ef">&</span><span style="color:#66d9ef">str</span>,
</span></span><span style="display:flex;"><span> perm: <span style="color:#66d9ef">u16</span>,
</span></span><span style="display:flex;"><span> ty: <span style="color:#a6e22e">FileType</span>,
</span></span><span style="display:flex;"><span> mode: <span style="color:#a6e22e">OpenMode</span>,
</span></span><span style="display:flex;"><span> extension: <span style="color:#66d9ef">&</span><span style="color:#66d9ef">str</span>,
</span></span><span style="display:flex;"><span> ) -> <span style="color:#a6e22e">FileResult</span><span style="color:#f92672"><</span>Self<span style="color:#f92672">></span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">/// Open the File, returning a handle to the open file, which handles
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#e6db74">/// file i/o. This is split into a second type since it is genuinely
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#e6db74">/// unrelated -- and the fact that a file is Open or Closed can be
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#e6db74">/// handled by the `arigato` server for us.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">open</span>(<span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self, mode: <span style="color:#a6e22e">OpenMode</span>) -> <span style="color:#a6e22e">FileResult</span><span style="color:#f92672"><</span>Self::OpenFile<span style="color:#f92672">></span>;
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#e6db74">/// Simplified version of the arigato OpenFile trait; this isn't actually
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">/// the same trait; there's some small cosmetic differences. The
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">/// actual trait can be found at:
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">///
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">/// https://docs.rs/arigato/latest/arigato/server/trait.OpenFile.html
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span><span style="color:#66d9ef">trait</span> OpenFile {
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">/// iounit to report for this file. The iounit reported is used for Read
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#e6db74">/// or Write operations to signal, if non-zero, the maximum size that is
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#e6db74">/// guaranteed to be transferred atomically.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">iounit</span>(<span style="color:#f92672">&</span>self) -> <span style="color:#66d9ef">u32</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">/// Read some number of bytes up to `buf.len()` from the provided
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#e6db74">/// `offset` of the underlying file. The number of bytes read is
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#e6db74">/// returned.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">read_at</span>(
</span></span><span style="display:flex;"><span> <span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self,
</span></span><span style="display:flex;"><span> buf: <span style="color:#66d9ef">&</span><span style="color:#a6e22e">mut</span> [<span style="color:#66d9ef">u8</span>],
</span></span><span style="display:flex;"><span> offset: <span style="color:#66d9ef">u64</span>,
</span></span><span style="display:flex;"><span> ) -> <span style="color:#a6e22e">FileResult</span><span style="color:#f92672"><</span><span style="color:#66d9ef">u32</span><span style="color:#f92672">></span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">/// Write some number of bytes up to `buf.len()` from the provided
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#e6db74">/// `offset` of the underlying file. The number of bytes written
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#e6db74">/// is returned.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74"></span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">write_at</span>(
</span></span><span style="display:flex;"><span> <span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self,
</span></span><span style="display:flex;"><span> buf: <span style="color:#66d9ef">&</span><span style="color:#a6e22e">mut</span> [<span style="color:#66d9ef">u8</span>],
</span></span><span style="display:flex;"><span> offset: <span style="color:#66d9ef">u64</span>,
</span></span><span style="display:flex;"><span> ) -> <span style="color:#a6e22e">FileResult</span><span style="color:#f92672"><</span><span style="color:#66d9ef">u32</span><span style="color:#f92672">></span>;
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><h1 id="thanks-decade-ago-paultag">Thanks, decade ago paultag!</h1>
<aside class="left">
If this isn't my record for longest idea-to-wip-project time, it's close.
</aside>
<p>Let’s do it! Let’s use <code>arigato</code> to implement a <code>9p</code> filesystem we’ll call
<a href="https://github.com/paultag/debugfs">debugfs</a> that will serve all the debug
files shipped according to the <code>Packages</code> metadata from the <code>apt</code> archive. We’ll
fetch the <code>Packages</code> file and construct a filesystem based on the reported
<code>Build-Id</code> entries. For those who don’t know much about how an <code>apt</code> repo
works, here’s the 2-second crash course on what we’re doing. The first step is to
fetch the <code>Packages</code> file, which is specific to a binary architecture (such as
<code>amd64</code>, <code>arm64</code> or <code>riscv64</code>). That <code>architecture</code> is specific to a
<code>component</code> (such as <code>main</code>, <code>contrib</code> or <code>non-free</code>). That <code>component</code> is
specific to a <code>suite</code>, such as <code>stable</code>, <code>unstable</code> or any of its aliases
(<code>bullseye</code>, <code>bookworm</code>, etc). Let’s take a look at the <code>Packages.xz</code> file for
the <code>unstable-debug</code> <code>suite</code>, <code>main</code> <code>component</code>, for all <code>amd64</code> binaries.</p>
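<p>Written as code, the hierarchy composes into a predictable path. The helper below is hypothetical (it’s not part of <code>apt</code>, <code>debugfs</code> or any tool mentioned here), but it shows how suite, component and architecture map onto the URL we’re about to fetch:</p>

```rust
// Hypothetical helper: build the Packages.xz URL for a given suite,
// component and architecture. The apt repo layout on a mirror is
// dists/<suite>/<component>/binary-<arch>/Packages.xz.
fn packages_url(mirror: &str, suite: &str, component: &str, arch: &str) -> String {
    format!("{mirror}/dists/{suite}/{component}/binary-{arch}/Packages.xz")
}

fn main() {
    let url = packages_url(
        "https://deb.debian.org/debian-debug",
        "unstable-debug",
        "main",
        "amd64",
    );
    // This is the exact URL fetched by hand below.
    assert_eq!(
        url,
        "https://deb.debian.org/debian-debug/dists/unstable-debug/main/binary-amd64/Packages.xz"
    );
    println!("{url}");
}
```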
<pre tabindex="0"><code>$ curl \
https://deb.debian.org/debian-debug/dists/unstable-debug/main/binary-amd64/Packages.xz \
| unxz
</code></pre><p>This will return the Debian-style
<a href="https://man7.org/linux/man-pages/man5/deb822.5.html">rfc2822-like</a> headers,
which is an export of the metadata contained inside each <code>.deb</code> file which
<code>apt</code> (or other tools that can use the <code>apt</code> repo format) use to fetch
information about debs. Let’s take a look at the debug headers for the
<code>netlabel-tools</code> package in <code>unstable</code> – which is a package named
<code>netlabel-tools-dbgsym</code> in <code>unstable-debug</code>.</p>
<pre tabindex="0"><code>Package: netlabel-tools-dbgsym
Source: netlabel-tools (0.30.0-1)
Version: 0.30.0-1+b1
Installed-Size: 79
Maintainer: Paul Tagliamonte <paultag@debian.org>
Architecture: amd64
Depends: netlabel-tools (= 0.30.0-1+b1)
Description: debug symbols for netlabel-tools
Auto-Built-Package: debug-symbols
Build-Ids: e59f81f6573dadd5d95a6e4474d9388ab2777e2a
Description-md5: a0e587a0cf730c88a4010f78562e6db7
Section: debug
Priority: optional
Filename: pool/main/n/netlabel-tools/netlabel-tools-dbgsym_0.30.0-1+b1_amd64.deb
Size: 62776
SHA256: 0e9bdb087617f0350995a84fb9aa84541bc4df45c6cd717f2157aa83711d0c60
</code></pre><p>So here, we can parse the package headers in the <code>Packages.xz</code> file, and store,
for each <code>Build-Id</code>, the <code>Filename</code> we can fetch the <code>.deb</code> from. Each
<code>.deb</code> contains a number of files – but we’re only really interested in the
files inside the <code>.deb</code> located at or under <code>/usr/lib/debug/.build-id/</code>,
which you can find in <code>debugfs</code> under
<a href="https://github.com/paultag/debugfs/blob/main/src/deb822.rs">deb822.rs</a>. It’s
crude, and very single-purpose, but I’m feeling a bit lazy.</p>
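<p>A crude sketch of that idea – hypothetical code, not the actual <code>deb822</code> parser; real <code>Packages</code> parsing also has to handle continuation lines and many more fields – looks something like this:</p>

```rust
// Hypothetical sketch of the Build-Id -> Filename index. Packages files
// are RFC 2822-like stanzas separated by blank lines; we only care about
// the Build-Ids and Filename fields of each stanza.
use std::collections::HashMap;

fn build_id_index(packages: &str) -> HashMap<String, String> {
    let mut index = HashMap::new();
    for stanza in packages.split("\n\n") {
        let mut build_ids = Vec::new();
        let mut filename = None;
        for line in stanza.lines() {
            if let Some(v) = line.strip_prefix("Build-Ids:") {
                // A dbgsym package can carry several Build-Ids.
                build_ids.extend(v.split_whitespace().map(String::from));
            } else if let Some(v) = line.strip_prefix("Filename:") {
                filename = Some(v.trim().to_string());
            }
        }
        if let Some(f) = filename {
            for id in build_ids {
                index.insert(id, f.clone());
            }
        }
    }
    index
}

fn main() {
    let stanza = "Package: netlabel-tools-dbgsym\n\
                  Build-Ids: e59f81f6573dadd5d95a6e4474d9388ab2777e2a\n\
                  Filename: pool/main/n/netlabel-tools/netlabel-tools-dbgsym_0.30.0-1+b1_amd64.deb\n";
    let index = build_id_index(stanza);
    assert_eq!(
        index["e59f81f6573dadd5d95a6e4474d9388ab2777e2a"],
        "pool/main/n/netlabel-tools/netlabel-tools-dbgsym_0.30.0-1+b1_amd64.deb"
    );
}
```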
<h1 id="who-needs-dpkg">Who needs dpkg?!</h1>
<aside class="right">
Hilariously, the fourth? fifth? non-serious time (second serious time)
I've had to do this for a new language.
</aside>
<p>For folks who haven’t seen it yet, a <code>.deb</code> file is a special type of
<a href="https://en.wikipedia.org/wiki/Ar_(Unix)">.ar</a> file, that contains (usually)
three files inside – <code>debian-binary</code>, <code>control.tar.xz</code> and <code>data.tar.xz</code>.
The core of an <code>.ar</code> file is a fixed-size (<code>60 byte</code>) entry header,
followed by the number of data bytes specified in that header’s <code>size</code> field.</p>
<pre tabindex="0"><code>[8 byte .ar file magic]
[60 byte entry header]
[N bytes of data]
[60 byte entry header]
[N bytes of data]
[60 byte entry header]
[N bytes of data]
...
</code></pre><aside class="left">
I can't believe it's already been over a decade since my NM process,
and nearly 16 years since I became an Ubuntu member.
</aside>
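<p>Given that layout, parsing a single entry header needs nothing but fixed offsets. Here’s an illustrative sketch (this is not the actual <code>ar.rs</code>; real <code>ar</code> variants also have quirks like name terminators that it glosses over):</p>

```rust
// Sketch of parsing one 60-byte ar entry header. Fields are fixed-width,
// space-padded ASCII: name (0..16), mtime (16..28), uid (28..34),
// gid (34..40), mode (40..48), size in decimal (48..58), then the
// two-byte terminator "`\n" (58..60).
fn ar_entry(header: &[u8; 60]) -> Option<(String, u64)> {
    // Every entry header must end with the magic terminator.
    if header[58] != b'`' || header[59] != b'\n' {
        return None;
    }
    let name = std::str::from_utf8(&header[0..16]).ok()?.trim_end().to_string();
    let size: u64 = std::str::from_utf8(&header[48..58])
        .ok()?
        .trim_end()
        .parse()
        .ok()?;
    Some((name, size))
}

fn main() {
    // A hand-built header for a 4-byte "debian-binary" member.
    let mut h = [b' '; 60];
    h[0..13].copy_from_slice(b"debian-binary");
    h[48..49].copy_from_slice(b"4");
    h[58] = b'`';
    h[59] = b'\n';
    assert_eq!(ar_entry(&h), Some(("debian-binary".to_string(), 4)));
}
```

<p>Knowing the <code>size</code> of each member is what lets us hop from header to header without reading the data in between, which matters in a moment.</p>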
<p>First up was to implement a basic <code>ar</code> parser in
<a href="https://github.com/paultag/debugfs/blob/main/src/ar.rs">ar.rs</a>. Before we get
into using it to parse a deb, as a quick diversion, let’s break apart a <code>.deb</code>
file by hand – something that is a bit of a rite of passage (or at least it
used to be? I’m getting old) during the Debian nm (new member) process, to take
a look at where exactly the <code>.debug</code> file lives inside the <code>.deb</code> file.</p>
<pre tabindex="0"><code>$ ar x netlabel-tools-dbgsym_0.30.0-1+b1_amd64.deb
$ ls
control.tar.xz debian-binary
data.tar.xz netlabel-tools-dbgsym_0.30.0-1+b1_amd64.deb
$ tar --list -f data.tar.xz | grep '.debug$'
./usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
</code></pre><p>Since we know quite a bit about the structure of a <code>.deb</code> file, and I had to
implement support from scratch anyway, I opted to implement a (very!) basic
debfile parser using HTTP Range requests. HTTP Range requests, if supported by
the server (denoted by an <code>accept-ranges: bytes</code> HTTP header in response to an
HTTP <code>HEAD</code> request to that file), mean that we can add a header such as
<code>range: bytes=8-68</code> to specifically request that the returned <code>GET</code> body be the
byte range provided (in the above case, the bytes from byte offset <code>8</code>
through byte offset <code>68</code>, inclusive). This means we can fetch just the ar entry headers from
the <code>.deb</code> file until we get to the file inside the <code>.deb</code> we are interested in
(in our case, the <code>data.tar.xz</code> file) – at which point we can request the body
of that file with a final <code>range</code> request. I wound up writing a struct to
handle a <code>read_at</code>-style API surface in
<a href="https://github.com/paultag/debugfs/blob/main/src/hrange.rs">hrange.rs</a>, which
we can pair with <code>ar.rs</code> above and start to find our data in the <code>.deb</code> remotely
without downloading and unpacking the <code>.deb</code> at all.</p>
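<p>To sketch how those ranges line up with the <code>ar</code> layout – with hypothetical helpers, not the actual <code>hrange.rs</code> – note that a <code>Range</code> end offset is inclusive, so a 60-byte header spans 60 offsets:</p>

```rust
// Hypothetical helpers: build Range header values to fetch first an ar
// entry header, and then the member data that header describes.
fn header_range(entry_offset: u64) -> String {
    // Range is inclusive on both ends: a 60-byte header covers
    // entry_offset ..= entry_offset + 59.
    format!("bytes={}-{}", entry_offset, entry_offset + 59)
}

fn data_range(entry_offset: u64, size: u64) -> String {
    // Member data starts immediately after the 60-byte header.
    format!("bytes={}-{}", entry_offset + 60, entry_offset + 60 + size - 1)
}

fn main() {
    // The first entry header sits right after the 8-byte "!<arch>\n" magic.
    assert_eq!(header_range(8), "bytes=8-67");
    assert_eq!(data_range(8, 4), "bytes=68-71");
}
```

<p>Parse each fetched header, skip <code>size</code> bytes forward, repeat until we hit <code>data.tar.xz</code>, then issue one last range request for its body.</p>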
<aside class="right">
I really like
<a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests">HTTP Range</a>
requests a lot.
</aside>
<aside class="left">
I did some stats to figure out what compression dbgsym packages use these
days; my LAN debug mirror contains 113459 xz compressed tarfiles, and 9
gzip compressed tarfiles at the time of writing.
</aside>
<p>After we have the body of the <code>data.tar.xz</code> coming back through the HTTP
response, we get to pipe it through an <code>xz</code> decompressor (this kinda sucked in
Rust, since a <code>tokio</code> <code>AsyncRead</code> is not the same as an <code>http</code> Body response is
not the same as <code>std::io::Read</code>, is not the same as an async (or sync)
<code>Iterator</code> is not the same as what the <code>xz2</code> crate expects; leading me to read
blocks of data to a buffer and stuff them through the decoder by looping over
the buffer for each <code>lzma2</code> packet in a loop), and <code>tar</code>file parser (similarly
troublesome). From there we get to iterate over all entries in the tarfile,
stopping when we reach our file of interest. Since we can’t seek, but <code>gdb</code>
needs to, we’ll pull it out of the stream into an in-memory <code>Cursor<Vec<u8>></code>
and pass a handle to it back to the user.</p>
<p>From here on out it’s a matter of
<a href="https://github.com/paultag/debugfs/blob/main/src/debugfs.rs">gluing together a File traited struct</a>
in <code>debugfs</code>, and serving the filesystem over TCP using <code>arigato</code>. Done
deal!</p>
<h1 id="a-quick-diversion-about-compression">A quick diversion about compression</h1>
<p>I was originally hoping to avoid transferring the whole tar file over the
network (and therefore also reading the whole debug file into ram, which
objectively sucks), but quickly hit issues with figuring out a way around
seeking around an <code>xz</code> file. What’s interesting is <code>xz</code> has a great primitive
to solve this specific problem (namely, using a block size that allows you
to seek to the block just before your desired seek position,
discarding at most <code>block size - 1</code> bytes), but <code>data.tar.xz</code> files
generated by <code>dpkg</code> appear to have a single mega-huge block for the whole file.
I don’t know why I would have expected any different, in retrospect. That means
that this now devolves into the base case of “How do I seek around an <code>lzma2</code>
compressed data stream”; which is a lot more complex of a question.</p>
<aside class="left">
After going through a lot of this, I realized just how complex
the xz format is -- it's a lot more than just lzma2!
</aside>
<p>Thankfully, notoriously brilliant <a href="https://github.com/tianon">tianon</a> was
nice enough to introduce me to <a href="https://github.com/jonjohnsonjr">Jon Johnson</a>
who did something super similar – he adapted a technique for seeking inside a
compressed <code>gzip</code> file, which lets his service
<a href="https://oci.dag.dev/?image=debian%3Aunstable">oci.dag.dev</a>
seek through Docker container images super fast based on some prior work
such as <code>soci-snapshotter</code>, <code>gztool</code>, and
<a href="https://github.com/madler/zlib/blob/0f51fb4933fc9ce18199cb2554dacea8033e7fd3/examples/zran.c">zran.c</a>.
He also pulled this party trick off for apk based distros
over at <a href="https://apk.dag.dev/">apk.dag.dev</a>, which seems apropos.
Jon was nice enough to publish a lot of his work on this specifically in a
central place under the name “<a href="https://github.com/jonjohnsonjr/targz">targz</a>”
on his GitHub, which has been a ton of fun to read through.</p>
<p>The gist is that, by dumping the decompressor’s state (window of previous
bytes, in-memory data derived from the last <code>N-1 bytes</code>) at specific
“checkpoints” along with the compressed data stream offset in bytes and
decompressed offset in bytes, one can seek to that checkpoint in the compressed
stream and pick up where you left off – creating a similar “block” mechanism
against the wishes of gzip. It means you’d need to do an <code>O(n)</code> run over the
file, but every request after that will be sped up according to the number
of checkpoints you’ve taken.</p>
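<p>The checkpoint <em>lookup</em> is the easy half of the trick; a toy sketch of just that half is below (the hard half – serializing and restoring real inflate state – is what <code>zran.c</code> and friends handle):</p>

```rust
// Toy sketch of the checkpoint index lookup only. Checkpoints pair a
// compressed offset with the decompressed offset it corresponds to; to
// serve a read we resume from the last checkpoint at or before the
// target, then decompress and discard bytes until we reach it.
struct Checkpoint {
    compressed_off: u64,
    decompressed_off: u64,
}

fn resume_point(index: &[Checkpoint], target: u64) -> Option<&Checkpoint> {
    // index is sorted by decompressed_off; take the last one <= target.
    index
        .iter()
        .take_while(|c| c.decompressed_off <= target)
        .last()
}

fn main() {
    let index = vec![
        Checkpoint { compressed_off: 0, decompressed_off: 0 },
        Checkpoint { compressed_off: 40_000, decompressed_off: 1 << 20 },
        Checkpoint { compressed_off: 81_000, decompressed_off: 2 << 20 },
    ];
    // A read at decompressed offset 1.5 MiB resumes from the 1 MiB
    // checkpoint, discarding ~0.5 MiB instead of decompressing from zero.
    let c = resume_point(&index, 1_500_000).unwrap();
    assert_eq!(c.compressed_off, 40_000);
}
```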
<p>Given the complexity of <code>xz</code> and <code>lzma2</code>, I don’t think this is possible
for me at the moment – especially given most of the files I’ll be requesting
will not be loaded from again – especially when I can “just” cache the debug
header by <code>Build-Id</code>. I want to implement this (because I’m generally curious
and Jon has a way of getting someone excited about compression schemes, which
is not a sentence I thought I’d ever say out loud), but for now I’m going to
move on without this optimization. Such a shame, since it kills a lot of the
work that went into seeking around the <code>.deb</code> file in the first place, given
the <code>debian-binary</code> and <code>control.tar.gz</code> members are so small.</p>
<h1 id="the-good">The Good</h1>
<p>First, the good news right? It works! That’s pretty cool. I’m positive
my younger self would be amused and happy to see this working; as is
current day paultag. Let’s take <code>debugfs</code> out for a spin! First, we need
to mount the filesystem. It even works on an entirely unmodified, stock
Debian box on my LAN, which is <em>huge</em>. Let’s take it for a spin:</p>
<pre tabindex="0"><code>$ mount \
-t 9p \
-o trans=tcp,version=9p2000.u,aname=unstable-debug \
192.168.0.2 \
/usr/lib/debug/.build-id/
</code></pre><p>And, let’s prove to ourselves that this actually mounted before we go
trying to use it:</p>
<pre tabindex="0"><code>$ mount | grep build-id
192.168.0.2 on /usr/lib/debug/.build-id type 9p (rw,relatime,aname=unstable-debug,access=user,trans=tcp,version=9p2000.u,port=564)
</code></pre><p>Slick. We’ve got an open connection to the server, where our host
will keep a connection alive as root, attached to the filesystem provided
in <code>aname</code>. Let’s take a look at it.</p>
<pre tabindex="0"><code>$ ls /usr/lib/debug/.build-id/
00 0d 1a 27 34 41 4e 5b 68 75 82 8E 9b a8 b5 c2 CE db e7 f3
01 0e 1b 28 35 42 4f 5c 69 76 83 8f 9c a9 b6 c3 cf dc E7 f4
02 0f 1c 29 36 43 50 5d 6a 77 84 90 9d aa b7 c4 d0 dd e8 f5
03 10 1d 2a 37 44 51 5e 6b 78 85 91 9e ab b8 c5 d1 de e9 f6
04 11 1e 2b 38 45 52 5f 6c 79 86 92 9f ac b9 c6 d2 df ea f7
05 12 1f 2c 39 46 53 60 6d 7a 87 93 a0 ad ba c7 d3 e0 eb f8
06 13 20 2d 3a 47 54 61 6e 7b 88 94 a1 ae bb c8 d4 e1 ec f9
07 14 21 2e 3b 48 55 62 6f 7c 89 95 a2 af bc c9 d5 e2 ed fa
08 15 22 2f 3c 49 56 63 70 7d 8a 96 a3 b0 bd ca d6 e3 ee fb
09 16 23 30 3d 4a 57 64 71 7e 8b 97 a4 b1 be cb d7 e4 ef fc
0a 17 24 31 3e 4b 58 65 72 7f 8c 98 a5 b2 bf cc d8 E4 f0 fd
0b 18 25 32 3f 4c 59 66 73 80 8d 99 a6 b3 c0 cd d9 e5 f1 fe
0c 19 26 33 40 4d 5a 67 74 81 8e 9a a7 b4 c1 ce da e6 f2 ff
</code></pre><p>Outstanding. Let’s try using <code>gdb</code> to debug a binary that was provided by
the <code>Debian</code> archive, and see if it’ll load the ELF by <code>build-id</code> from the
right <code>.deb</code> in the <code>unstable-debug</code> suite:</p>
<pre tabindex="0"><code>$ gdb -q /usr/sbin/netlabelctl
Reading symbols from /usr/sbin/netlabelctl...
Reading symbols from /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug...
(gdb)
</code></pre><p>Yes! Yes it will!</p>
<pre tabindex="0"><code>$ file /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
/usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter *empty*, BuildID[sha1]=e59f81f6573dadd5d95a6e4474d9388ab2777e2a, for GNU/Linux 3.2.0, with debug_info, not stripped
</code></pre><h1 id="the-bad">The Bad</h1>
<p>Linux’s support for <code>9p</code> is mainline, which is great, but it’s not robust.
Network issues or server restarts will wedge the mountpoint (Linux can’t
reconnect when the tcp connection breaks), and things that work fine on local
filesystems get translated in a way that causes a lot of network chatter – for
instance, just due to the way the syscalls are translated, doing an <code>ls</code> will
result in a <code>stat</code> call for each file in the directory, even though Linux had
just gotten a <code>stat</code> entry for every file while it was resolving directory names.
On top of that, Linux will serialize all I/O with the server, so there are no
concurrent requests for file information, writes, or reads pending at the same
time to the server; and <code>read</code> and <code>write</code> throughput will degrade as latency
increases due to increasing round-trip time, even though there are offsets
included in the <code>read</code> and <code>write</code> calls. It works well enough, but is
frustrating to run up against, since there’s not a lot you can do server-side
to help with this beyond implementing the <code>9P2000.L</code> variant (which, maybe is
worth it).</p>
<h1 id="the-ugly">The Ugly</h1>
<p>Unfortunately, we don’t know the file size(s) until we’ve actually opened the
underlying <code>tar</code> file and found the correct member, so for most files, we don’t
know the real size to report when getting a <code>stat</code>. We can’t parse the tarfiles
for every <code>stat</code> call, since that’d make <code>ls</code> even slower (bummer). The only
hiccup is that when I report a filesize of zero, <code>gdb</code> throws a bit of a
fit; let’s try with a size of <code>0</code> to start:</p>
<pre tabindex="0"><code>$ ls -lah /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
-r--r--r-- 1 root root 0 Dec 31 1969 /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
$ gdb -q /usr/sbin/netlabelctl
Reading symbols from /usr/sbin/netlabelctl...
Reading symbols from /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug...
warning: Discarding section .note.gnu.build-id which has a section size (24) larger than the file size [in module /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug]
[...]
</code></pre><p>This obviously won’t work since <code>gdb</code> will throw away all our hard work because
of <code>stat</code>’s output, and neither will loading the real size of the underlying
file. That only leaves us with hardcoding a file size and hoping nothing else
breaks significantly as a result. Let’s try it again:</p>
<pre tabindex="0"><code>$ ls -lah /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
-r--r--r-- 1 root root 954M Dec 31 1969 /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
$ gdb -q /usr/sbin/netlabelctl
Reading symbols from /usr/sbin/netlabelctl...
Reading symbols from /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug...
(gdb)
</code></pre><p>Much better. I mean, terrible but better. Better for now, anyway.</p>
<h1 id="kilroy-was-here">Kilroy was here</h1>
<p>Do I think this is a particularly good idea? I mean, kinda. I’m probably going
to make some fun <code>9p</code> <code>arigato</code>-based filesystems for use around my LAN, but I
don’t think I’ll be moving to use <code>debugfs</code> until I can make the connection
more resilient to changing networks and server restarts, and fix the i/o
performance. I think it was a useful exercise and is a pretty
great hack, but I don’t think this’ll be shipping anywhere anytime soon.</p>
<p>Along with publishing this post, I’ve pushed up all my repos, so you
should be able to play along at home! There’s a lot more work to be done
on <code>arigato</code>, but it does handshake and successfully export a working
<code>9P2000.u</code> filesystem. Check it out on my GitHub at
<a href="https://github.com/paultag/arigato">arigato</a>,
<a href="https://github.com/paultag/debugfs">debugfs</a>
and also on <a href="https://crates.io/crates/arigato">crates.io</a>
and <a href="https://docs.rs/arigato">docs.rs</a>.</p>
<p>At least I can say I was here and I got it working after all these years.</p> Using PKCS#11 on GNU/Linux https://notes.pault.ag/pkcs11/Sun, 07 Aug 2016 20:17:00 -0500 https://notes.pault.ag/pkcs11/ <p>PKCS#11 is a standard API to interface with HSMs, Smart Cards, or other types
of random hardware backed crypto. On my travel laptop, I use a few Yubikeys in
PKCS#11 mode using OpenSC to handle system login. <code>libpam-pkcs11</code> is a pretty
easy-to-use module that will let you log into your system locally using a
PKCS#11 token.</p>
<p>One of the least documented things, though, was how to use an OpenSC PKCS#11
token in Chrome. First, close all web browsers you have open.</p>
<pre>
sudo apt-get install libnss3-tools
certutil -U -d sql:$HOME/.pki/nssdb
modutil -add "OpenSC" -libfile /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so -dbdir sql:$HOME/.pki/nssdb
modutil -list "OpenSC" -dbdir sql:$HOME/.pki/nssdb
modutil -enable "OpenSC" -dbdir sql:$HOME/.pki/nssdb
</pre>
<p>Now, we’ll have the PKCS#11 module ready for <code>nss</code> to use, so let’s double
check that the tokens are registered:</p>
<pre>
certutil -U -d sql:$HOME/.pki/nssdb
certutil -L -h "OpenSC" -d sql:$HOME/.pki/nssdb
</pre>
<p>If this winds up causing issues, you can remove it using the following
command:</p>
<pre>
modutil -delete "OpenSC" -dbdir sql:$HOME/.pki/nssdb
</pre> Hacking a Projector in Hy https://notes.pault.ag/hacking-a-projector-in-hy/Sun, 31 Jul 2016 12:02:00 -0500 https://notes.pault.ag/hacking-a-projector-in-hy/ <p>About a year ago, I bought a Projector after I finally admitted that I could
actually use a TV in my apartment. I settled on buying a
<a href="https://ap.viewsonic.com/il/products/projectors/PJD5132.php">ViewSonic PJD5132</a>.
It was a really great value, and has been nothing short of a delight to own.</p>
<p>I was always a bit curious about the DB9 connector on the back of the unit,
so I dug into the user manual, and found some hex code strings in there. So,
last year, between my last gig at the
<a href="https://sunlightfoundation.com/">Sunlight Foundation</a> and
<a href="https://www.usds.gov/">USDS</a>, I spent some time wandering around the US,
hitting up <a href="https://debconf15.debconf.org/">DebConf</a>, and exploring Washington
DC. Between trips, I set out to figure out exactly what was going on with my
Projector, and see if I could make it do anything fun.</p>
<p>So, I started off with basics, and tried to work out how these command codes
were structured. I had a few working codes, but to write clean code, I’d be
better off understanding why the codes looked like they do. Let’s look at
the “Power On” code.</p>
<pre>
0x06 0x14 0x00 0x04 0x00 0x34 0x11 0x00 0x00 0x5D
</pre>
<p>Some were 10 bytes, others were 11, and most started with similar-looking
prefixes. The first byte was usually a <code>0x06</code> or <code>0x07</code>, followed by the two
bytes <code>0x14 0x00</code>, and either a <code>0x04</code> or <code>0x05</code>. Since the first few bytes
were similarly structured, and the first 4 octets always seemed to be present,
I assumed the first octet (either <code>0x06</code> or <code>0x07</code>) was actually a length.</p>
<p>So, my best guess is that we have a Length byte at index 0, followed by
two bytes for the Protocol, a flag for whether you’re Reading or Writing (best
guess on that one), and opaque data after that. Sometimes the data is a const
of sorts, and sometimes an octet (either little or big endian, confusingly).</p>
<aside class="left">
These are all just wild guesses, but thinking of it like this has actually
helped a bit, so I'm just going to use this as my working understanding
and adjust as needed.
</aside>
<pre>
Length
 |        Read / Write
 |              |
 |   Protocol   |               Data
 |   |-------|  |   |---------------------------|
0x06 0x14 0x00 0x04 0x00 0x34 0x11 0x00 0x00 0x5D
</pre>
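<p>To sanity-check this working model, here’s a quick sketch in Python (rather than Hy, purely for illustration). It bakes in one extra guess that isn’t in the diagram above: the final byte appears to be the mod-256 sum of every byte after the length octet, which holds for every code shown in this post, so I’m treating it as a checksum.</p>

```python
# Parse a frame using the working model above. Treating the final byte as
# a mod-256 sum of everything after the length octet is my own guess; it
# happens to hold for all of the codes I pulled out of the manual.
POWER_ON = [0x06, 0x14, 0x00, 0x04, 0x00, 0x34, 0x11, 0x00, 0x00, 0x5D]

def parse_frame(frame):
    length, proto, rw = frame[0], frame[1:3], frame[3]
    payload, checksum = frame[4:-1], frame[-1]
    # guess: the length octet counts every byte after the 4-byte header
    assert length == len(frame) - 4, "length byte doesn't match frame size"
    # guess: the trailing byte is a simple additive checksum
    assert checksum == sum(frame[1:-1]) & 0xFF, "checksum guess doesn't hold"
    return proto, rw, payload

print(parse_frame(POWER_ON))  # -> ([20, 0], 4, [0, 52, 17, 0, 0])
```

<p>Running that over the "Power On", "Power Off", "Power Status", and "Reset" codes above, both guesses hold, which is at least encouraging for the working model.</p>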
<p>Right. OK. So, let’s get to work. In the spirit of code is data, data is code,
I hacked up some of the projector codes into a s-expression we can use later.
The structure of this is boring, but it’ll let us both store the command
code to issue, as well as define the handler to read the data back.</p>
<pre>
(setv *commands*
  ; function       type   family  control
  '((power-on      nil    nil     (0x06 0x14 0x00 0x04 0x00 0x34 0x11 0x00 0x00 0x5D))
    (power-off     nil    nil     (0x06 0x14 0x00 0x04 0x00 0x34 0x11 0x01 0x00 0x5E))
    (power-status  const  power   (0x07 0x14 0x00 0x05 0x00 0x34 0x00 0x00 0x11 0x00 0x5E))
    (reset         nil    nil     (0x06 0x14 0x00 0x04 0x00 0x34 0x11 0x02 0x00 0x5F))
    ...
</pre>
<p>I also defined some of the const responses that come back from the
Projector itself. These are pretty boring, but it’s helpful to put a
name to the int that falls out.</p>
<pre>
(setv *consts*
  '((power  ((on  (0x00 0x00 0x01))
             (off (0x00 0x00 0x00))))
    (freeze ((on  (0x00 0x00 0x01))
             (off (0x00 0x00 0x00))))
    ...
</pre>
<p>After defining a few simple functions to write byte arrays to the serial
port, as well as to read and interpret responses from the projector, I could
start elaborating some higher-order functions that can talk projector. The
first thing I wrote was a function that converts a command entry into a
native Hy function.</p>
<pre>
(defn make-api-function [function type family data]
  `(defn ~function [serial]
     (import [PJD5132.dsl [interpret-response]]
             [PJD5132.serial [read-response/raw]])
     (serial.write (bytearray [~@data]))
     (interpret-response ~(str type) ~(str family) (read-response/raw serial))))
</pre>
<p>Fun. Fun! Now, we can invoke it to create a Hy & Python importable API wrapper
in a few lines!</p>
<pre>
(import [PJD5132.commands [*commands*]]
        [PJD5132.dsl [make-api-function]])

(list (map (fn [(, function type family command)]
             (make-api-function function type family command))
           *commands*))
</pre>
<p>Cool. So, now we can import things like <code>power-on</code> from <code>*commands*</code>; each
generated function takes a single argument (<code>serial</code>) for the serial port,
sends its command, and returns the response. The best part about all this is
that you only have to define the data once in a list, and the rest comes for
free.</p>
<p>Finally, I do want to be able to turn my projector on and off over the
network, so I went ahead and made a Flask “API” on top of this. First, let’s
define a macro to define Flask routes:</p>
<pre>
(defmacro defroute [name root &rest methods]
  (import os.path)
  (defn generate-method [path method status]
    `(with-decorator (app.route ~path)
       (fn []
         (import [PJD5132.api [~method ~(if status status method)]])
         (try
           (do (setv ret (~method serial-line))
               ~(if status `(setv ret (~status serial-line)))
               (json.dumps ret))
           (except [e ValueError]
             (setv response (make-response (.format "Fatal Error: ValueError: {}" (str e))))
             (setv response.status-code 500)
             response)))))
  (setv path (.format "/projector/{}" name))
  (setv actions (dict methods))
  `(do ~(generate-method path root nil)
       ~@(list-comp (generate-method (os.path.join path method-path) method root)
                    [(, method-path method) methods])))
</pre>
<p>Now, we can define how we want our API to look, so let’s define the <code>power</code>
route, which will expand out into the Flask route code above.</p>
<pre>
(defroute power
power-status
("on" power-on)
("off" power-off))
</pre>
<p>And now, let’s play with it!</p>
<pre>
$ curl https://192.168.1.50/projector/power
"off"
$ curl https://192.168.1.50/projector/power/on
"on"
$ curl https://192.168.1.50/projector/power
"on"
</pre>
<p>Or, the volume!</p>
<pre>
$ curl 192.168.1.50/projector/volume
10
$ curl 192.168.1.50/projector/volume/decrease
9
$ curl 192.168.1.50/projector/volume/decrease
8
$ curl 192.168.1.50/projector/volume/decrease
7
$ curl 192.168.1.50/projector/volume/increase
8
$ curl 192.168.1.50/projector/volume/increase
9
$ curl 192.168.1.50/projector/volume/increase
10
</pre>
<p>Check out the full source over at <a href="https://github.com/paultag/PJD5132/">github.com/paultag/PJD5132</a>!</p> The Open Source License API https://notes.pault.ag/osi-license-api/Sat, 16 Jul 2016 15:30:00 -0500 https://notes.pault.ag/osi-license-api/ <p>Around a year ago, I started hacking together a machine readable version
of the OSI-approved licenses list, casually picking away at it until it
was ready to launch. A few weeks ago, we officially announced
the <a href="https://opensource.org/node/822">osi license api</a>, which is now
live at <a href="https://api.opensource.org/">api.opensource.org</a>.</p>
<p>I also took a whack at writing a few API bindings, in
<a href="https://github.com/opensourceorg/python-opensource">Python</a>,
<a href="https://github.com/opensourceorg/ruby-opensourceapi">Ruby</a>,
and using the models from the API implementation itself in
<a href="https://github.com/OpenSourceOrg/api/tree/master/client">Go</a>. In the following
few weeks, <a href="https://github.com/clinty">Clint</a> wrote one in <a href="https://github.com/OpenSourceOrg/haskell-opensource">Haskell</a>,
<a href="https://mornie.org/">Eriol</a> wrote one in <a href="https://github.com/opensourceorg/rust-opensource">Rust</a>,
and <a href="https://ironholds.org/">Oliver</a> wrote one in <a href="https://cran.r-project.org/web/packages/osi/">R</a>.</p>
<p>The data is sourced from a <a href="https://github.com/opensourceorg/licenses">repo on GitHub</a>,
the <code>licenses</code> repo under <code>OpenSourceOrg</code>. Pull Requests against that repo are
wildly encouraged! Additional data ideas, cleanup or more hand collected data
would be wonderful!</p>
<p>In the meantime, use cases for this API range from language package
managers programmatically checking a license’s OSI approval, to taking a
license identifier as defined in one dataset (SPDX, for example) and finding
the identifier as it exists in another system (DEP5, Wikipedia,
TL;DR Legal).</p>
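<p>As a concrete sketch of the identifier-mapping use case, here are a few lines of Python. The <code>/license/{id}</code> route and the <code>identifiers</code> field are my assumptions about the deployed API’s shape (based on the wrappers above), so double-check them against the live service before relying on them.</p>

```python
import json
from urllib.request import urlopen

BASE = "https://api.opensource.org"

def license_url(license_id):
    # Assumption: /license/{id} returns a single license record; verify
    # against the live API before depending on the exact route.
    return "{}/license/{}".format(BASE, license_id)

def identifiers_for(record, scheme):
    # Map a license record to its identifiers in another scheme
    # (e.g. "SPDX" or "DEP5"), assuming records carry an "identifiers"
    # list of {"identifier": ..., "scheme": ...} objects.
    return [i["identifier"] for i in record.get("identifiers", [])
            if i.get("scheme") == scheme]

# Usage (network required):
#   record = json.load(urlopen(license_url("MIT")))
#   identifiers_for(record, "DEP5")
```

<p>This is the kind of glue that lets a package manager take an SPDX identifier and resolve the same license in another naming system with a single request.</p>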
<p>Patches are hugely welcome, as are bug reports or ideas! I’d also love more
API wrappers for other languages!</p> Hello, InfluxDB https://notes.pault.ag/hello-influxdb/Sat, 02 Jul 2016 13:13:00 -0500 https://notes.pault.ag/hello-influxdb/ <p>Last week, I posted about <a href="https://notes.pault.ag/hello-sense/">python-sense</a>,
an API wrapper for the internal Sense API. I wrote this so that I could
pull data about myself into my own databases, allowing me to use that
information for myself.</p>
<p>One way I’m doing this is by pulling my room data into an
<a href="https://influxdata.com/">InfluxDB</a> database, letting me run time series
queries against my environmental data.</p>
<pre>
#!/usr/bin/env python

import datetime as dt

from influxdb import InfluxDBClient
from sense.service import Sense

api = Sense()
data = api.room_sensors(quantity=20)

def items(data):
    for flavor, series in data.items():
        for datum in reversed(series):
            value = datum['value']
            if value == -1:
                continue
            timezone = dt.timezone(dt.timedelta(
                seconds=datum['offset_millis'] / 1000,
            ))
            when = dt.datetime.fromtimestamp(
                datum['datetime'] / 1000,
            ).replace(tzinfo=timezone)
            yield flavor, when, value

client = InfluxDBClient(
    'url.to.host.here',
    443,
    'username',
    'password',
    'sense',
    ssl=True,
)

def series(data):
    for flavor, when, value in items(data):
        yield {
            "measurement": "{}".format(flavor),
            "tags": {
                "user": "paultag",
            },
            "time": when.isoformat(),
            "fields": {
                "value": value,
            },
        }

client.write_points(list(series(data)))
</pre>
<p>I’m able to run this on a cron, automatically loading data from the Sense
API into my Influx database. I can then use that with something like
<a href="https://grafana.org/">Grafana</a>, to check out what my room looks like over
time.</p>
<p><img src="https://notes.pault.ag/static/posts/hello-influx/sense-influx-light.png" alt=""></p>
<p><img src="https://notes.pault.ag/static/posts/hello-influx/sense-influx-temp.png" alt=""></p> Hello, Sense! https://notes.pault.ag/hello-sense/Sun, 26 Jun 2016 21:42:00 -0500 https://notes.pault.ag/hello-sense/ <p>A while back, I saw a <a href="https://www.kickstarter.com/projects/hello/sense-know-more-sleep-better">Kickstarter</a>
for one of the most well designed and pretty sleep trackers on the market. I
fell in love with it, and it has stuck with me since.</p>
<p>A few months ago, I finally got my hands on one and started to track my data.
Naturally, I now want to store this new data with the rest of the data I have
on myself in my own databases.</p>
<p>I went in search of an API, but I found that the Sense API hasn’t been published
yet, and is being worked on by the team. Here’s hoping it’ll land soon!</p>
<p>After some subdomain guessing, I hit on <a href="https://api.hello.is">api.hello.is</a>.
So, naturally, I went to take a quick look at their Android app and its network
traffic, and lo and behold, there was a pretty nicely designed API.</p>
<p>This API is clearly an internal API, and as such, it’s something that
<strong>should not</strong> be considered stable. However, I’m OK with a fragile API,
so <a href="https://github.com/paultag/python-sense">I’ve published a quick and dirty API wrapper for the Sense API
to my GitHub</a>.</p>
<p>I’ve published it because I’ve found it useful, but I can’t promise the world,
(since I’m not a member of the Sense team at Hello!), so here are a few ground
rules of this wrapper:</p>
<ul>
<li>I make no claims to its stability or completeness.</li>
<li>I have no documentation or assurances.</li>
<li>I will not provide the client secret and ID. You’ll have to find them on
your own.</li>
<li>This may stop working without any notice, and there may even be really nasty
bugs that result in your alarm going off at 4 AM.</li>
<li>Send PRs! This is a side-project for me.</li>
</ul>
<p>This module is currently Python 3 only. If someone really needs Python 2
support, I’m open to minimally invasive patches to the codebase using
<code>six</code> to support Python 2.7.</p>
<h2 id="working-with-the-api">Working with the API:</h2>
<p>First, let’s go ahead and log in using <code>python -m sense</code>.</p>
<pre>
$ python -m sense
Sense OAuth Client ID: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
Sense OAuth Client Secret: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
Sense email: paultag@gmail.com
Sense password:
Attempting to log into Sense's API
Success!
Attempting to query the Sense API
The humidity is **just right**.
The air quality is **just right**.
The light level is **just right**.
It's **pretty hot** in here.
The noise level is **just right**.
Success!
</pre>
<p>Now, let’s see if we can pull up information on my Sense:</p>
<pre>
>>> from sense import Sense
>>> sense = Sense()
>>> sense.devices()
{'senses': [{'id': 'xxxxxxxxxxxxxxxx', 'firmware_version': '11a1', 'last_updated': 1466991060000, 'state': 'NORMAL', 'wifi_info': {'rssi': 0, 'ssid': 'Pretty Fly for a WiFi (2.4 GhZ)', 'condition': 'GOOD', 'last_updated': 1462927722000}, 'color': 'BLACK'}], 'pills': [{'id': 'xxxxxxxxxxxxxxxx', 'firmware_version': '2', 'last_updated': 1466990339000, 'battery_level': 87, 'color': 'BLUE', 'state': 'NORMAL'}]}
</pre>
<p>Neat! Pretty cool. Look, you can even see my WiFi AP! Let’s try some more
and pull some trends out.</p>
<pre>
>>> values = [x.get("value") for x in sense.room_sensors()["humidity"]][:10]
>>> min(values)
45.73904
>>> max(values)
45.985928
>>>
</pre>
<p>I plan to keep maintaining it as long as it’s needed, so I welcome
co-maintainers, and I’d love to see what people build with it! So far, I’m
using it to dump my room data into InfluxDB, pulling information on my room
into Grafana. Hopefully more to come!</p>
<p>Happy hacking!</p> Go Debian! https://notes.pault.ag/go-debian/Sun, 19 Jun 2016 12:30:00 -0500 https://notes.pault.ag/go-debian/ <p>As some of the world knows full well by now, I’ve been noodling with Go
for a few years, working through its pros, its cons, and thinking a lot
about how humans use code to express thoughts and ideas. Go’s got a lot of
neat use cases, suited to particular problems, and when used in the right place,
you can see some clear, massive wins.</p>
<aside class="left">
Some of the things Go is great at: Writing a server. Dealing with asynchronous
communication. Backend and front-end in the same binary. Fast and memory safe.
</aside>
<aside class="right">
Things Go is bad at: Having to rebuild everything for a CVE. Having if
`err != nil` everywhere. "Better than C" being the excuse for bad semantics.
No generics, cgo (enough said)
</aside>
<p>I’ve started writing Debian tooling in Go, because it’s a pretty natural fit.
Go’s fairly tight, and overhead shouldn’t be taken up by your operating system.
After a while, I wound up hitting the usual blockers, and started to build up
abstractions. They became pretty darn useful, so this blog post is announcing
a (still incomplete, year-old, and perhaps API-changing) Debian package for Go.
The Go importable name is <code>pault.ag/go/debian</code>. This contains a lot of utilities
for dealing with Debian packages, and will become an edited down “toolbelt”
for working with or on Debian packages.</p>
<h1 id="module-overview">Module Overview</h1>
<p>Currently, the package contains five major sub-packages: a <code>changelog</code>
parser, a <code>control</code> file parser, a <code>deb</code> file format parser, a <code>dependency</code>
parser, and a <code>version</code> parser. Together, these are a set of powerful
building blocks which can be used to create higher-order systems with reliable
understandings of the world.</p>
<h2 id="changelog">changelog</h2>
<p>The first (and perhaps most incomplete and least tested) is a <a href="https://godoc.org/pault.ag/go/debian/changelog">changelog file
parser</a>. This provides the
programmer with the ability to pull out the suite being targeted in the
changelog, when each upload happened, and the version of each. For example,
let’s look at when all the uploads of Docker to sid took place:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#66d9ef">func</span> <span style="color:#a6e22e">main</span>() {
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">resp</span>, <span style="color:#a6e22e">err</span> <span style="color:#f92672">:=</span> <span style="color:#a6e22e">http</span>.<span style="color:#a6e22e">Get</span>(<span style="color:#e6db74">"https://metadata.ftp-master.debian.org/changelogs/main/d/docker.io/unstable_changelog"</span>)
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#a6e22e">err</span> <span style="color:#f92672">!=</span> <span style="color:#66d9ef">nil</span> {
</span></span><span style="display:flex;"><span> panic(<span style="color:#a6e22e">err</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">allEntries</span>, <span style="color:#a6e22e">err</span> <span style="color:#f92672">:=</span> <span style="color:#a6e22e">changelog</span>.<span style="color:#a6e22e">Parse</span>(<span style="color:#a6e22e">resp</span>.<span style="color:#a6e22e">Body</span>)
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#a6e22e">err</span> <span style="color:#f92672">!=</span> <span style="color:#66d9ef">nil</span> {
</span></span><span style="display:flex;"><span> panic(<span style="color:#a6e22e">err</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> <span style="color:#a6e22e">_</span>, <span style="color:#a6e22e">entry</span> <span style="color:#f92672">:=</span> <span style="color:#66d9ef">range</span> <span style="color:#a6e22e">allEntries</span> {
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">fmt</span>.<span style="color:#a6e22e">Printf</span>(<span style="color:#e6db74">"Version %s was uploaded on %s\n"</span>, <span style="color:#a6e22e">entry</span>.<span style="color:#a6e22e">Version</span>, <span style="color:#a6e22e">entry</span>.<span style="color:#a6e22e">When</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The output of which looks like:</p>
<pre tabindex="0"><code>Version 1.8.3~ds1-2 was uploaded on 2015-11-04 00:09:02 -0800 -0800
Version 1.8.3~ds1-1 was uploaded on 2015-10-29 19:40:51 -0700 -0700
Version 1.8.2~ds1-2 was uploaded on 2015-10-29 07:23:10 -0700 -0700
Version 1.8.2~ds1-1 was uploaded on 2015-10-28 14:21:00 -0700 -0700
Version 1.7.1~dfsg1-1 was uploaded on 2015-08-26 10:13:48 -0700 -0700
Version 1.6.2~dfsg1-2 was uploaded on 2015-07-01 07:45:19 -0600 -0600
Version 1.6.2~dfsg1-1 was uploaded on 2015-05-21 00:47:43 -0600 -0600
Version 1.6.1+dfsg1-2 was uploaded on 2015-05-10 13:02:54 -0400 EDT
Version 1.6.1+dfsg1-1 was uploaded on 2015-05-08 17:57:10 -0600 -0600
Version 1.6.0+dfsg1-1 was uploaded on 2015-05-05 15:10:49 -0600 -0600
Version 1.6.0+dfsg1-1~exp1 was uploaded on 2015-04-16 18:00:21 -0600 -0600
Version 1.6.0~rc7~dfsg1-1~exp1 was uploaded on 2015-04-15 19:35:46 -0600 -0600
Version 1.6.0~rc4~dfsg1-1 was uploaded on 2015-04-06 17:11:33 -0600 -0600
Version 1.5.0~dfsg1-1 was uploaded on 2015-03-10 22:58:49 -0600 -0600
Version 1.3.3~dfsg1-2 was uploaded on 2015-01-03 00:11:47 -0700 -0700
Version 1.3.3~dfsg1-1 was uploaded on 2014-12-18 21:54:12 -0700 -0700
Version 1.3.2~dfsg1-1 was uploaded on 2014-11-24 19:14:28 -0500 EST
Version 1.3.1~dfsg1-2 was uploaded on 2014-11-07 13:11:34 -0700 -0700
Version 1.3.1~dfsg1-1 was uploaded on 2014-11-03 08:26:29 -0700 -0700
Version 1.3.0~dfsg1-1 was uploaded on 2014-10-17 00:56:07 -0600 -0600
Version 1.2.0~dfsg1-2 was uploaded on 2014-10-09 00:08:11 +0000 +0000
Version 1.2.0~dfsg1-1 was uploaded on 2014-09-13 11:43:17 -0600 -0600
Version 1.0.0~dfsg1-1 was uploaded on 2014-06-13 21:04:53 -0400 EDT
Version 0.11.1~dfsg1-1 was uploaded on 2014-05-09 17:30:45 -0400 EDT
Version 0.9.1~dfsg1-2 was uploaded on 2014-04-08 23:19:08 -0400 EDT
Version 0.9.1~dfsg1-1 was uploaded on 2014-04-03 21:38:30 -0400 EDT
Version 0.9.0+dfsg1-1 was uploaded on 2014-03-11 22:24:31 -0400 EDT
Version 0.8.1+dfsg1-1 was uploaded on 2014-02-25 20:56:31 -0500 EST
Version 0.8.0+dfsg1-2 was uploaded on 2014-02-15 17:51:58 -0500 EST
Version 0.8.0+dfsg1-1 was uploaded on 2014-02-10 20:41:10 -0500 EST
Version 0.7.6+dfsg1-1 was uploaded on 2014-01-22 22:50:47 -0500 EST
Version 0.7.1+dfsg1-1 was uploaded on 2014-01-15 20:22:34 -0500 EST
Version 0.6.7+dfsg1-3 was uploaded on 2014-01-09 20:10:20 -0500 EST
Version 0.6.7+dfsg1-2 was uploaded on 2014-01-08 19:14:02 -0500 EST
Version 0.6.7+dfsg1-1 was uploaded on 2014-01-07 21:06:10 -0500 EST
</code></pre><h2 id="control">control</h2>
<p>Next is one of the most complex and oldest parts of <code>go-debian</code>:
the <a href="https://godoc.org/pault.ag/go/debian/control">control file parser</a>
(sometimes known as <code>deb822</code>). This module was inspired by the way
that the <code>json</code> module works in Go, allowing for files to be defined in code
with a <code>struct</code>. This tends to be a bit more declarative, but it also winds up
putting logic into struct tags, which can be a nasty anti-pattern if used too
much.</p>
<p>The first primitive in this module is the concept of a <code>Paragraph</code>, a struct
containing two values: the order in which keys were seen, and a map of <code>string</code>
to <code>string</code>. All higher-order functions dealing with control files go through
this type, which is a helpful interchange format to be aware of. All parsing of
meaning from the control file happens when the <code>Paragraph</code> is unpacked into
a struct using reflection.</p>
<p>The idea behind this strategy is that you define your struct and let the Control
parser handle unpacking the data from the IO into your container, letting you
maintain type safety: you never have to read and cast, since the conversion
handles this for you and returns an unmarshaling error in the event of failure.</p>
<aside class="right">
I'm starting to think parsing and defining the control structs are two different
tasks and should be split apart -- or the common structs ought to be removed
entirely. More on this later.
</aside>
<p>Additionally, Structs that define an anonymous member of <code>control.Paragraph</code>
will have the raw <code>Paragraph</code> struct of the underlying file, allowing the
programmer to handle dynamic tags (such as <code>X-Foo</code>), or at least, letting
them survive the round-trip through go.</p>
<p>The default <a href="https://godoc.org/pault.ag/go/debian/control#NewDecoder">decoder</a>
takes an argument: an OpenPGP keyring used to verify the input control file,
with the resulting signer exposed to the programmer through the
<code>(*Decoder).Signer()</code> function. If the passed argument is nil, it will not
check the input file signature (at all!), and if a keyring has been passed, any
signed data must be found or an <code>error</code> will fall out of the <code>NewDecoder</code> call.
On the way out, the opposite happens: the struct is introspected,
turned into a <code>control.Paragraph</code>, and then written out to the <code>io.Writer</code>.</p>
<p>Here’s a quick (and VERY dirty) example showing the basics of reading and
writing Debian Control files with <code>go-debian</code>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#f92672">package</span> <span style="color:#a6e22e">main</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">import</span> (
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"fmt"</span>
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"io"</span>
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"net/http"</span>
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"strings"</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"pault.ag/go/debian/control"</span>
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">type</span> <span style="color:#a6e22e">AllowedPackage</span> <span style="color:#66d9ef">struct</span> {
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">Package</span> <span style="color:#66d9ef">string</span>
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">Fingerprint</span> <span style="color:#66d9ef">string</span>
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">func</span> (<span style="color:#a6e22e">a</span> <span style="color:#f92672">*</span><span style="color:#a6e22e">AllowedPackage</span>) <span style="color:#a6e22e">UnmarshalControl</span>(<span style="color:#a6e22e">in</span> <span style="color:#66d9ef">string</span>) <span style="color:#66d9ef">error</span> {
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">in</span> = <span style="color:#a6e22e">strings</span>.<span style="color:#a6e22e">TrimSpace</span>(<span style="color:#a6e22e">in</span>)
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">chunks</span> <span style="color:#f92672">:=</span> <span style="color:#a6e22e">strings</span>.<span style="color:#a6e22e">SplitN</span>(<span style="color:#a6e22e">in</span>, <span style="color:#e6db74">" "</span>, <span style="color:#ae81ff">2</span>)
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> len(<span style="color:#a6e22e">chunks</span>) <span style="color:#f92672">!=</span> <span style="color:#ae81ff">2</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">fmt</span>.<span style="color:#a6e22e">Errorf</span>(<span style="color:#e6db74">"Syntax sucks: '%s'"</span>, <span style="color:#a6e22e">in</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">a</span>.<span style="color:#a6e22e">Package</span> = <span style="color:#a6e22e">chunks</span>[<span style="color:#ae81ff">0</span>]
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">a</span>.<span style="color:#a6e22e">Fingerprint</span> = <span style="color:#a6e22e">chunks</span>[<span style="color:#ae81ff">1</span>][<span style="color:#ae81ff">1</span> : len(<span style="color:#a6e22e">chunks</span>[<span style="color:#ae81ff">1</span>])<span style="color:#f92672">-</span><span style="color:#ae81ff">1</span>]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> <span style="color:#66d9ef">nil</span>
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">type</span> <span style="color:#a6e22e">DMUA</span> <span style="color:#66d9ef">struct</span> {
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">Fingerprint</span> <span style="color:#66d9ef">string</span>
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">Uid</span> <span style="color:#66d9ef">string</span>
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">AllowedPackages</span> []<span style="color:#a6e22e">AllowedPackage</span> <span style="color:#e6db74">`control:"Allow" delim:","`</span>
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">func</span> <span style="color:#a6e22e">main</span>() {
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">resp</span>, <span style="color:#a6e22e">err</span> <span style="color:#f92672">:=</span> <span style="color:#a6e22e">http</span>.<span style="color:#a6e22e">Get</span>(<span style="color:#e6db74">"https://metadata.ftp-master.debian.org/dm.txt"</span>)
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#a6e22e">err</span> <span style="color:#f92672">!=</span> <span style="color:#66d9ef">nil</span> {
</span></span><span style="display:flex;"><span> panic(<span style="color:#a6e22e">err</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">decoder</span>, <span style="color:#a6e22e">err</span> <span style="color:#f92672">:=</span> <span style="color:#a6e22e">control</span>.<span style="color:#a6e22e">NewDecoder</span>(<span style="color:#a6e22e">resp</span>.<span style="color:#a6e22e">Body</span>, <span style="color:#66d9ef">nil</span>)
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#a6e22e">err</span> <span style="color:#f92672">!=</span> <span style="color:#66d9ef">nil</span> {
</span></span><span style="display:flex;"><span> panic(<span style="color:#a6e22e">err</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> {
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">dmua</span> <span style="color:#f92672">:=</span> <span style="color:#a6e22e">DMUA</span>{}
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#a6e22e">err</span> <span style="color:#f92672">:=</span> <span style="color:#a6e22e">decoder</span>.<span style="color:#a6e22e">Decode</span>(<span style="color:#f92672">&</span><span style="color:#a6e22e">dmua</span>); <span style="color:#a6e22e">err</span> <span style="color:#f92672">!=</span> <span style="color:#66d9ef">nil</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#a6e22e">err</span> <span style="color:#f92672">==</span> <span style="color:#a6e22e">io</span>.<span style="color:#a6e22e">EOF</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">break</span>
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> panic(<span style="color:#a6e22e">err</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">fmt</span>.<span style="color:#a6e22e">Printf</span>(<span style="color:#e6db74">"The DM %s is allowed to upload:\n"</span>, <span style="color:#a6e22e">dmua</span>.<span style="color:#a6e22e">Uid</span>)
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> <span style="color:#a6e22e">_</span>, <span style="color:#a6e22e">allowedPackage</span> <span style="color:#f92672">:=</span> <span style="color:#66d9ef">range</span> <span style="color:#a6e22e">dmua</span>.<span style="color:#a6e22e">AllowedPackages</span> {
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">fmt</span>.<span style="color:#a6e22e">Printf</span>(<span style="color:#e6db74">" %s [granted by %s]\n"</span>, <span style="color:#a6e22e">allowedPackage</span>.<span style="color:#a6e22e">Package</span>, <span style="color:#a6e22e">allowedPackage</span>.<span style="color:#a6e22e">Fingerprint</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Output (truncated!) looks a bit like:</p>
<pre tabindex="0"><code>...
The DM Allison Randal <allison@lohutok.net> is allowed to upload:
parrot [granted by A4F455C3414B10563FCC9244AFA51BD6CDE573CB]
...
The DM Benjamin Barenblat <bbaren@mit.edu> is allowed to upload:
boogie [granted by 3224C4469D7DF8F3D6F41A02BBC756DDBE595F6B]
dafny [granted by 3224C4469D7DF8F3D6F41A02BBC756DDBE595F6B]
transmission-remote-gtk [granted by 3224C4469D7DF8F3D6F41A02BBC756DDBE595F6B]
urweb [granted by 3224C4469D7DF8F3D6F41A02BBC756DDBE595F6B]
...
The DM أحمد المحمودي <aelmahmoudy@sabily.org> is allowed to upload:
covered [granted by 41352A3B4726ACC590940097F0A98A4C4CD6E3D2]
dico [granted by 6ADD5093AC6D1072C9129000B1CCD97290267086]
drawtiming [granted by 41352A3B4726ACC590940097F0A98A4C4CD6E3D2]
fonts-hosny-amiri [granted by BD838A2BAAF9E3408BD9646833BE1A0A8C2ED8FF]
...
...
</code></pre><h2 id="deb">deb</h2>
<p>Next up, we’ve got the <code>deb</code> module. This contains code to handle reading
Debian 2.0 <code>.deb</code> files. It contains a wrapper that will parse the control
member, and provide the data member through the
<a href="https://godoc.org/archive/tar">archive/tar</a> interface.</p>
<p>Here’s an example of how to read a <code>.deb</code> file, access some
metadata, iterate over the <code>tar</code> archive, and print the filenames
of each of the entries.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#66d9ef">func</span> <span style="color:#a6e22e">main</span>() {
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">path</span> <span style="color:#f92672">:=</span> <span style="color:#e6db74">"/tmp/fluxbox_1.3.5-2+b1_amd64.deb"</span>
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">fd</span>, <span style="color:#a6e22e">err</span> <span style="color:#f92672">:=</span> <span style="color:#a6e22e">os</span>.<span style="color:#a6e22e">Open</span>(<span style="color:#a6e22e">path</span>)
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#a6e22e">err</span> <span style="color:#f92672">!=</span> <span style="color:#66d9ef">nil</span> {
</span></span><span style="display:flex;"><span> panic(<span style="color:#a6e22e">err</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">defer</span> <span style="color:#a6e22e">fd</span>.<span style="color:#a6e22e">Close</span>()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">debFile</span>, <span style="color:#a6e22e">err</span> <span style="color:#f92672">:=</span> <span style="color:#a6e22e">deb</span>.<span style="color:#a6e22e">Load</span>(<span style="color:#a6e22e">fd</span>, <span style="color:#a6e22e">path</span>)
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#a6e22e">err</span> <span style="color:#f92672">!=</span> <span style="color:#66d9ef">nil</span> {
</span></span><span style="display:flex;"><span> panic(<span style="color:#a6e22e">err</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">version</span> <span style="color:#f92672">:=</span> <span style="color:#a6e22e">debFile</span>.<span style="color:#a6e22e">Control</span>.<span style="color:#a6e22e">Version</span>
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">fmt</span>.<span style="color:#a6e22e">Printf</span>(
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"Epoch: %d, Version: %s, Revision: %s\n"</span>,
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">version</span>.<span style="color:#a6e22e">Epoch</span>, <span style="color:#a6e22e">version</span>.<span style="color:#a6e22e">Version</span>, <span style="color:#a6e22e">version</span>.<span style="color:#a6e22e">Revision</span>,
</span></span><span style="display:flex;"><span> )
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> {
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">hdr</span>, <span style="color:#a6e22e">err</span> <span style="color:#f92672">:=</span> <span style="color:#a6e22e">debFile</span>.<span style="color:#a6e22e">Data</span>.<span style="color:#a6e22e">Next</span>()
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#a6e22e">err</span> <span style="color:#f92672">==</span> <span style="color:#a6e22e">io</span>.<span style="color:#a6e22e">EOF</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">break</span>
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#a6e22e">err</span> <span style="color:#f92672">!=</span> <span style="color:#66d9ef">nil</span> {
</span></span><span style="display:flex;"><span> panic(<span style="color:#a6e22e">err</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">fmt</span>.<span style="color:#a6e22e">Printf</span>(<span style="color:#e6db74">" -> %s\n"</span>, <span style="color:#a6e22e">hdr</span>.<span style="color:#a6e22e">Name</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Boringly, the output looks like:</p>
<pre tabindex="0"><code>Epoch: 0, Version: 1.3.5, Revision: 2+b1
-> ./
-> ./etc/
-> ./etc/menu-methods/
-> ./etc/menu-methods/fluxbox
-> ./etc/X11/
-> ./etc/X11/fluxbox/
-> ./etc/X11/fluxbox/window.menu
-> ./etc/X11/fluxbox/fluxbox.menu-user
-> ./etc/X11/fluxbox/keys
-> ./etc/X11/fluxbox/init
-> ./etc/X11/fluxbox/system.fluxbox-menu
-> ./etc/X11/fluxbox/overlay
-> ./etc/X11/fluxbox/apps
-> ./usr/
-> ./usr/share/
-> ./usr/share/man/
-> ./usr/share/man/man5/
-> ./usr/share/man/man5/fluxbox-style.5.gz
-> ./usr/share/man/man5/fluxbox-menu.5.gz
-> ./usr/share/man/man5/fluxbox-apps.5.gz
-> ./usr/share/man/man5/fluxbox-keys.5.gz
-> ./usr/share/man/man1/
-> ./usr/share/man/man1/startfluxbox.1.gz
...
</code></pre><h2 id="dependency">dependency</h2>
<p>The <code>dependency</code> package provides an interface to parse and compute
dependencies. This package is a bit odd in that, well, there’s no other
library that does this. The issue is that there are actually two different
parsers that compute our Dependency lines, one in Perl (as part of <code>dpkg-dev</code>)
and another in C (in <code>dpkg</code>).</p>
<aside class="left">
I have yet to track it down, but it's shockingly likely that `apt` has another
in `C++`, and maybe another in `aptitude`. I don't know this for a fact, so
I'll assume nothing.
</aside>
<p>To date, this has resulted in me filing
<a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=816473">three</a>
<a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=784808">different</a>
<a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=784806">bugs</a>.
I also found a broken package in the
<a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=816741">archive</a>,
which actually resulted in another bug being (totally accidentally)
<a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=815478">already fixed</a>.
I plan to keep running the archive through my parser in hopes of finding
more bugs! This package is a bit complex, but it basically just returns what
amounts to an <a href="https://en.wikipedia.org/wiki/Abstract_syntax_tree">AST</a>
for our Dependency lines. I’m positive there are bugs, so file them!</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#66d9ef">func</span> <span style="color:#a6e22e">main</span>() {
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">dep</span>, <span style="color:#a6e22e">err</span> <span style="color:#f92672">:=</span> <span style="color:#a6e22e">dependency</span>.<span style="color:#a6e22e">Parse</span>(<span style="color:#e6db74">"foo | bar, baz, foobar [amd64] | bazfoo [!sparc], fnord:armhf [gnu-linux-sparc]"</span>)
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#a6e22e">err</span> <span style="color:#f92672">!=</span> <span style="color:#66d9ef">nil</span> {
</span></span><span style="display:flex;"><span> panic(<span style="color:#a6e22e">err</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">anySparc</span>, <span style="color:#a6e22e">err</span> <span style="color:#f92672">:=</span> <span style="color:#a6e22e">dependency</span>.<span style="color:#a6e22e">ParseArch</span>(<span style="color:#e6db74">"sparc"</span>)
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#a6e22e">err</span> <span style="color:#f92672">!=</span> <span style="color:#66d9ef">nil</span> {
</span></span><span style="display:flex;"><span> panic(<span style="color:#a6e22e">err</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> <span style="color:#a6e22e">_</span>, <span style="color:#a6e22e">possi</span> <span style="color:#f92672">:=</span> <span style="color:#66d9ef">range</span> <span style="color:#a6e22e">dep</span>.<span style="color:#a6e22e">GetPossibilities</span>(<span style="color:#f92672">*</span><span style="color:#a6e22e">anySparc</span>) {
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">fmt</span>.<span style="color:#a6e22e">Printf</span>(<span style="color:#e6db74">"%s (%s)\n"</span>, <span style="color:#a6e22e">possi</span>.<span style="color:#a6e22e">Name</span>, <span style="color:#a6e22e">possi</span>.<span style="color:#a6e22e">Arch</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Gives the output:</p>
<pre tabindex="0"><code>foo (<nil>)
baz (<nil>)
fnord (armhf)
</code></pre><h2 id="version">version</h2>
<p>Right off the bat, I’d like to thank
<a href="https://twitter.com/zekjur">Michael Stapelberg</a> for letting me graft this
out of <a href="https://github.com/debian/dcs">dcs</a> and into the <code>go-debian</code> package.
This was nearly entirely his work (with a one or two line function I added
later), and was amazingly helpful to have. Thank you!</p>
<p>This module implements Debian version comparisons and parsing, allowing for
sorting in lists, checking to see if a version is native or not, and letting
the programmer implement smart(er!) logic based on upstream (or Debian)
version numbers.</p>
<p>This module is extremely easy to use and very straightforward, and not worth
writing an example for.</p>
<h1 id="final-thoughts">Final thoughts</h1>
<p>This is more of a “Yeah, OK, this has been useful enough to me at this point
that I’m going to support this” rather than a “It’s stable!” or even
“It’s alive!” post. Hopefully folks can report bugs and help iterate on
this module until we have some really clean building blocks to build
solid higher level systems on top of. Being able to have multiple libraries
interoperate by relying on <code>go-debian</code> will make life massively easier.
I’m in need of more documentation, and to finalize some parts of the older
sub package APIs, but I’m hoping to be at a “1.0” real soon now.</p> It's all relative https://notes.pault.ag/its-all-relative/Fri, 10 Jun 2016 23:45:00 -0500 https://notes.pault.ag/its-all-relative/ <p>As nearly anyone who’s worked with me will attest to, I’ve long since
touted <a href="https://nedbatchelder.com">nedbat’s</a> talk
<a href="https://nedbatchelder.com/text/unipain.html">Pragmatic Unicode, or, How do I stop the pain?</a>
as one of the most foundational talks, and required watching for all programmers.</p>
<p>The reason is that nedbat hits on something bigger – something more
fundamental than how to handle Unicode – it’s how to handle data which is
relative.</p>
<p>For those who want the TL;DR, the argument is as follows:</p>
<p>Facts of Life:</p>
<ol>
<li>Computers work with Bytes. Bytes go in, Bytes go out.</li>
<li>The world needs more than 256 symbols.</li>
<li>You need both Bytes and Unicode.</li>
<li>You cannot infer the encoding of bytes.</li>
<li>Declared encodings can be Wrong.</li>
</ol>
<p>Now, to fix it, the following protips:</p>
<ol>
<li><a href="https://nedbatchelder.com/text/unipain/unipain.html#35">Unicode sandwich</a></li>
<li>Know what you have</li>
<li>TEST</li>
</ol>
<h2 id="relative-data">Relative Data</h2>
<p>I’ve started to think more about why we do the things we do when we write
code, and one thing that continues to be a source of morbid schadenfreude
is watching code break by failing to handle Unicode right. It’s hard! However,
watching <em>what</em> breaks lets you gain a bit of insight into how the author
thinks, and what assumptions they make.</p>
<p>When you send someone Unicode, there are a lot of assumptions that have to be
made. Your computer has to trust what you (yes, you!) entered into your web
browser, your web browser has to pass that on over the network (most of the
time without encoding information), to a server which reads that bytestream,
and makes a wild guess at what it should be. That server might save it to a
database, and interpolate it into an HTML template in a different encoding
(called <a href="https://simple.wikipedia.org/wiki/Mojibake">Mojibake</a>), resulting
in a bad time for everyone involved.</p>
<p>Everything’s awful, and the fact our computers can continue to display
text to us is a goddamn miracle. Never forget that.</p>
<p>When it comes down to it, when I see a byte sitting on a page, I don’t know
(and can’t know!) if it’s <code>Windows-1252</code>, <code>UTF-8</code>, <code>Latin-1</code>, or <code>EBCDIC</code>. What’s
a poem to me is terminal garbage to you.</p>
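<p>A few lines of Go make the point concrete. The same five bytes are a
perfectly valid string in more than one encoding – nothing in the bytes
themselves tells you which one was meant:</p>

```go
package main

import "fmt"

func main() {
	// "café" encoded as UTF-8 is five bytes: 0x63 0x61 0x66 0xc3 0xa9.
	raw := []byte("café")

	// Read the very same bytes as Latin-1, where every byte is one rune.
	latin1 := make([]rune, len(raw))
	for i, b := range raw {
		latin1[i] = rune(b)
	}

	fmt.Println(string(raw))    // café   (if you assume UTF-8)
	fmt.Println(string(latin1)) // cafÃ©  (if you assume Latin-1)
}
```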
<p>Over the years, hacks have evolved. We have
<a href="https://en.wikipedia.org/wiki/Magic_number_(programming)">magic numbers</a>,
and plain ole’ hacks to just guess based on the content. Of course, like
all good computer programs, this has lead to its fair share of hilarious
<a href="https://bugs.launchpad.net/ubuntu/+source/cupsys/+bug/255161/comments/28">bugs</a>,
and there’s nothing stopping files from (validly!) being multiple things at the
same time.</p>
<p><em>Like many things, it’s all in the eye of the beholder</em>.</p>
<h2 id="timezones">Timezones</h2>
<p>Just like Unicode, this is a word that can put your friendly neighborhood
programmer into a series of profanity-laden tirades. Go find one in the wild,
and ask them what they think about the timezone handling bugs they’ve seen.
I’ll wait. Go ahead.</p>
<p>Rants are funny things. They’re fun to watch. Hilarious to give. Sometimes
just getting it all out can help. They can tell you a lot about the true
nature of problems.</p>
<p>It’s funny to consider the isomorphic nature of Unicode rants and Timezone
rants.</p>
<p><em>I don’t think this is an accident.</em></p>
<h2 id="unicode-timezone-sandwich">U̶n̶i̶c̶o̶d̶e̶ timezone Sandwich</h2>
<p>Ned’s Unicode Sandwich applies – As early as we can, in the lowest level
we can (reading from the database, filesystem, wherever!), all datetimes
must be timezone qualified with their correct timezone. Always. If you mean
UTC, say it’s in UTC.</p>
<p>Treat any unqualified datetimes as “bytes”. They’re not to be trusted.
<a href="https://youtu.be/W7wpzKvNhfA?t=3m18s">Never, never, never trust ’em</a>. Don’t
process any datetimes until you’re sure they’re in the right timezone.</p>
<p>This lets the delicious inside of your datetime sandwich handle timezones
with grace, and finally, as late as you can, turn it back into bytes
(if at all!). Treat locations as <code>tzdb</code> entries, and qualify datetime
objects into their absolute timezone (<code>EST</code>, <code>EDT</code>, <code>PST</code>, <code>PDT</code>).</p>
<p>It’s not until you want to show the datetime to the user again that you
should consider how to re-encode your datetime to bytes. You should think about
what flavor of bytes, what encoding – what timezone – should I be
encoding into?</p>
<h2 id="test">TEST</h2>
<p>Just like Unicode, testing that your code works with datetimes is important.
Every time I think about how to go about doing this, I think about that
one time that <a href="https://mjg59.dreamwidth.org/">mjg59</a> couldn’t book a flight
starting Tuesday from AKL, landing in HNL on Monday night, because
United couldn’t book the last leg to SFO. Do you ever assume dates only go
forward as time goes on? Remember timezones.</p>
<p>Construct test data, make sure someone in New Zealand’s
<a href="https://en.wikipedia.org/wiki/UTC%2B13:45">+13:45</a> can correctly talk with
their friends in
Baker Island’s <a href="https://en.wikipedia.org/wiki/UTC%E2%88%9212:00">-12:00</a>,
and that the events sort right.</p>
<p>Just because it’s Noon on New Years Eve in England doesn’t mean it’s not
1 AM the next year in New Zealand. Places a few miles apart may switch to
Daylight Saving Time on different days. Indian Standard Time is not even aligned on the hour
to GMT (<code>+05:30</code>)!</p>
<p>Test early, and test often. Memorize a few timezones, and challenge
your assumptions when writing code that has to do with time. Don’t use
wall clocks to mean monotonic time. Remember there’s a whole world out there,
and we only deal with part of it.</p>
<p>It’s also worth remembering, as <a href="https://twitter.com/andrewindc">Andrew Pendleton</a>
pointed out to me, that it’s possible that a datetime isn’t even <em>unique</em> for a
place, since you can never know if <code>2016-11-06 01:00:00</code> in <code>America/New_York</code>
(in the <code>tzdb</code>) is the first one, or second one. Storing <code>EST</code> or <code>EDT</code> along
with your datetime may help, though!</p>
<h2 id="pitfalls">Pitfalls</h2>
<p>Improper handling of timezones can lead to some interesting things, and failing
to be explicit (or at least, very rigid) in what you expect will lead to an
unholy class of bugs we’ve all come to hate. At best, you have confused
users doing math; at worst, someone misses a critical event, or our
security code fails.</p>
<p>I recently found what I regard to be a pretty bad
<a href="https://bugs.debian.org/819697">bug in apt</a> (which David has prepared a
<a href="https://anonscm.debian.org/cgit/apt/apt.git/diff/?id=9febc2b">fix</a>
for and is pending upload, yay! Thank you!), which boiled down to documentation
and code expecting datetimes in a timezone, but <em>accepting any timezone</em>, and
<em>silently</em> treating it as <code>UTC</code>.</p>
<p>The solution is to hard-fail, which is an interesting choice to me (as a vocal
fan of timezone aware code), but at the least it won’t fail by
misunderstanding what the server is trying to communicate, and I do understand
and empathize with the situation the <code>apt</code> maintainers are in.</p>
<h2 id="final-thoughts">Final Thoughts</h2>
<p>Overall, my main point is although most modern developers know how to deal
with Unicode pain, I think there is a more general lesson to learn – namely,
you should always know what data you have, and always remember what it is.
Understand assumptions as early as you can, and always store them with the data.</p> Docker PostgreSQL Foreign Data Wrapper https://notes.pault.ag/dockerfdw/Thu, 18 Sep 2014 21:49:00 -0500 https://notes.pault.ag/dockerfdw/ <p>For the tl;dr: <a href="https://github.com/paultag/dockerfdw">Docker FDW</a> is a thing.
Star it, hack it, try it out. File bugs, be happy. If you want to see what it’s
like to read, there’s some example SQL down below.</p>
<aside class="left">
This post was edited on Sep 21st to add information about the
<code>DELETE</code> and <code>INSERT</code> operators
</aside>
<p>First things first: what the heck is a PostgreSQL Foreign Data Wrapper?
PostgreSQL Foreign Data Wrappers are plugins that allow C libraries
to provide an adaptor for PostgreSQL to talk to an external database.</p>
<p>Some folks have used this to wrap stuff like
<a href="https://github.com/citusdata/mongo_fdw">MongoDB</a>, which I always found
to be hilarious (and an epic hack).</p>
<h1 id="enter-multicorn">Enter Multicorn</h1>
<p>During my time at <a href="https://pygotham.org/">PyGotham</a>, I saw a talk from
<a href="https://twitter.com/weschow">Wes Chow</a> about something called
<a href="https://multicorn.org/">Multicorn</a>. He was showing off some really neat
plugins, such as the git revision history of CPython, and parsed logfiles
from some stuff over at Chartbeat. This basically blew my mind.</p>
<aside class="right">
If you're interested in some of these, there are a bunch in the
Multicorn VCS repo, such as the
<a href="https://github.com/Kozea/Multicorn/blob/master/python/multicorn/gitfdw.py">gitfdw</a>
example.
</aside>
<p>All throughout the talk I was coming up with all sorts of things that I wanted
to do – this whole library is basically exactly what I’ve been dreaming
about for years. I’ve always wanted to provide a SQL-like interface
into querying API data, joining data cross-API using common crosswalks,
such as using <a href="https://capitolwords.org/">Capitol Words</a> to query for
Legislators, and use the
<a href="https://bioguide.congress.gov/biosearch/biosearch.asp">bioguide ids</a>
to <code>JOIN</code> against the <a href="https://sunlightlabs.github.io/congress/">congress api</a>
to get their Twitter account names.</p>
<p>My first shot was to Multicorn the new
<a href="https://opencivicdata.org/">Open Civic Data</a> API I was working on;
I chuckled and put it aside as a really awesome hack.</p>
<h1 id="enter-docker">Enter Docker</h1>
<p>It wasn’t until <a href="https://github.com/tianon">tianon</a> connected the dots for me
and suggested a <a href="https://docker.io/">Docker</a> FDW that I got really excited.
Cue a few hours of hacking, and I’m proud to say – here’s
<a href="https://github.com/paultag/dockerfdw">Docker FDW</a>.</p>
<p>This lets us ask all sorts of really interesting questions out of the API,
and might even help folks writing webapps avoid adding too much Docker-aware
logic. Abstractions can be fun!</p>
<h1 id="setting-it-up">Setting it up</h1>
<aside class="left">
The only stumbling block you might find (at least on Debian and Ubuntu) is
that you'll need a Multicorn `.deb`. It's currently undergoing an
official Debianization from the Postgres team, but in the meantime I put
the source and binary up on my
<a href="https://people.debian.org/~paultag/tmp/">people.debian.org</a>.
Feel free to use that while the Debian PostgreSQL team prepares the upload
to unstable.
</aside>
<p>I’m going to assume you have a working Multicorn, PostgreSQL and Docker setup
(including adding the <code>postgres</code> user to the <code>docker</code> group).</p>
<p>So, now let’s pop open a <code>psql</code> session. Create a database (I called mine
<code>dockerfdw</code>, but it can be anything), and let’s create some tables.</p>
<p>Before we create the tables, we need to let PostgreSQL know where our
objects are. This takes a name for the <code>server</code>, and the <code>Python</code> importable
path to our FDW.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span><span style="color:#66d9ef">CREATE</span> SERVER docker_containers <span style="color:#66d9ef">FOREIGN</span> <span style="color:#66d9ef">DATA</span> WRAPPER multicorn <span style="color:#66d9ef">options</span> (
</span></span><span style="display:flex;"><span> wrapper <span style="color:#e6db74">'dockerfdw.wrappers.containers.ContainerFdw'</span>);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">CREATE</span> SERVER docker_image <span style="color:#66d9ef">FOREIGN</span> <span style="color:#66d9ef">DATA</span> WRAPPER multicorn <span style="color:#66d9ef">options</span> (
</span></span><span style="display:flex;"><span> wrapper <span style="color:#e6db74">'dockerfdw.wrappers.images.ImageFdw'</span>);
</span></span></code></pre></div><p>Now that we have the server in place, we can tell PostgreSQL to create a table
backed by the FDW by creating a foreign table. I won’t go too much into the
syntax here, but you might also note that we pass in some options - these are
passed to the constructor of the FDW, letting us set stuff like the Docker
host.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span><span style="color:#66d9ef">CREATE</span> <span style="color:#66d9ef">foreign</span> <span style="color:#66d9ef">table</span> docker_containers (
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"id"</span> TEXT,
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"image"</span> TEXT,
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"name"</span> TEXT,
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"names"</span> TEXT[],
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"privileged"</span> BOOLEAN,
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"ip"</span> TEXT,
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"bridge"</span> TEXT,
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"running"</span> BOOLEAN,
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"pid"</span> INT,
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"exit_code"</span> INT,
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"command"</span> TEXT[]
</span></span><span style="display:flex;"><span>) server docker_containers <span style="color:#66d9ef">options</span> (
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">host</span> <span style="color:#e6db74">'unix:///run/docker.sock'</span>
</span></span><span style="display:flex;"><span>);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">CREATE</span> <span style="color:#66d9ef">foreign</span> <span style="color:#66d9ef">table</span> docker_images (
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"id"</span> TEXT,
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"architecture"</span> TEXT,
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"author"</span> TEXT,
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"comment"</span> TEXT,
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"parent"</span> TEXT,
</span></span><span style="display:flex;"><span> <span style="color:#e6db74">"tags"</span> TEXT[]
</span></span><span style="display:flex;"><span>) server docker_image <span style="color:#66d9ef">options</span> (
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">host</span> <span style="color:#e6db74">'unix:///run/docker.sock'</span>
</span></span><span style="display:flex;"><span>);
</span></span></code></pre></div><p>And, now that we have tables in place, we can try to learn something about the
Docker containers. Let’s start with something fun - a join from containers
to images, showing all image tag names, the container names, and the IP of the
container (if it has one!).</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span><span style="color:#66d9ef">SELECT</span> docker_containers.ip, docker_containers.<span style="color:#66d9ef">names</span>, docker_images.tags
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">FROM</span> docker_containers
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">RIGHT</span> <span style="color:#66d9ef">JOIN</span> docker_images
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">ON</span> docker_containers.image<span style="color:#f92672">=</span>docker_images.id;
</span></span></code></pre></div><pre tabindex="0"><code> ip | names | tags
-------------+-----------------------------+-----------------------------------------
| | {ruby:latest}
| | {paultag/vcs-mirror:latest}
| {/de-openstates-to-ocd} | {sunlightlabs/scrapers-us-state:latest}
| {/ny-openstates-to-ocd} | {sunlightlabs/scrapers-us-state:latest}
| {/ar-openstates-to-ocd} | {sunlightlabs/scrapers-us-state:latest}
172.17.0.47 | {/ms-openstates-to-ocd} | {sunlightlabs/scrapers-us-state:latest}
172.17.0.46 | {/nc-openstates-to-ocd} | {sunlightlabs/scrapers-us-state:latest}
| {/ia-openstates-to-ocd} | {sunlightlabs/scrapers-us-state:latest}
| {/az-openstates-to-ocd} | {sunlightlabs/scrapers-us-state:latest}
| {/oh-openstates-to-ocd} | {sunlightlabs/scrapers-us-state:latest}
| {/va-openstates-to-ocd} | {sunlightlabs/scrapers-us-state:latest}
172.17.0.41 | {/wa-openstates-to-ocd} | {sunlightlabs/scrapers-us-state:latest}
| {/jovial_poincare} | {<none>:<none>}
| {/jolly_goldstine} | {<none>:<none>}
| {/cranky_torvalds} | {<none>:<none>}
| {/backstabbing_wilson} | {<none>:<none>}
| {/desperate_hoover} | {<none>:<none>}
| {/backstabbing_ardinghelli} | {<none>:<none>}
| {/cocky_feynman} | {<none>:<none>}
| | {paultag/postgres:latest}
| | {debian:testing}
| | {paultag/crank:latest}
| | {<none>:<none>}
| | {<none>:<none>}
| {/stupefied_fermat} | {hackerschool/doorbot:latest}
| {/focused_euclid} | {debian:unstable}
| {/focused_babbage} | {debian:unstable}
| {/clever_torvalds} | {debian:unstable}
| {/stoic_tesla} | {debian:unstable}
| {/evil_torvalds} | {debian:unstable}
| {/foo} | {debian:unstable}
(31 rows)
</code></pre><p>OK, let’s see if we can bring this to the next level now. I finally got around
to implementing <code>INSERT</code> and <code>DELETE</code> operations, which turned out to be
pretty simple to do. Check this out:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span><span style="color:#66d9ef">DELETE</span> <span style="color:#66d9ef">FROM</span> docker_containers;
</span></span></code></pre></div><pre tabindex="0"><code>DELETE 1
</code></pre><p>This will do a <code>stop</code> + <code>kill</code> after a 10-second grace period behind the scenes. It’s
actually a lot of fun to spawn up a container and terminate it from
<code>PostgreSQL</code>.</p>
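<p>For a rough sense of what that <code>DELETE</code> maps to under the hood, here’s a hedged sketch; the client object and its <code>stop</code> method are illustrative stand-ins, not dockerfdw’s actual code:</p>

```python
# Hedged sketch: how a DELETE on the foreign table could translate into a
# stop-then-kill against the Docker daemon. The client and its `stop` method
# are illustrative stand-ins, not dockerfdw's real implementation.
class FakeDockerClient:
    """Records calls so the behavior can be shown without a running daemon."""
    def __init__(self):
        self.calls = []

    def stop(self, container_id, timeout):
        # The real daemon sends SIGTERM, waits `timeout` seconds,
        # then SIGKILLs if the process is still running.
        self.calls.append(("stop", container_id, timeout))


def delete_container(client, container_id, grace_seconds=10):
    """Mirror the foreign table's DELETE: stop, with a 10-second grace period."""
    client.stop(container_id, timeout=grace_seconds)


client = FakeDockerClient()
delete_container(client, "0a903dcf5ae1")
print(client.calls)  # [('stop', '0a903dcf5ae1', 10)]
```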
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span><span style="color:#66d9ef">INSERT</span> <span style="color:#66d9ef">INTO</span> docker_containers (name, image) <span style="color:#66d9ef">VALUES</span> (<span style="color:#e6db74">'hello'</span>, <span style="color:#e6db74">'debian:unstable'</span>) RETURNING id;
</span></span></code></pre></div><pre tabindex="0"><code> id
------------------------------------------------------------------
0a903dcf5ae10ee1923064e25ab0f46e0debd513f54860beb44b2a187643ff05
INSERT 0 1
(1 row)
</code></pre><p>Spawning containers works too - this is still very immature and not super
practical, but I figure while I’m showing off, I might as well go all the way.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span><span style="color:#66d9ef">SELECT</span> ip <span style="color:#66d9ef">FROM</span> docker_containers <span style="color:#66d9ef">WHERE</span> id<span style="color:#f92672">=</span><span style="color:#e6db74">'0a903dcf5ae10ee1923064e25ab0f46e0debd513f54860beb44b2a187643ff05'</span>;
</span></span></code></pre></div><pre tabindex="0"><code> ip
-------------
172.17.0.12
(1 row)
</code></pre><p>Success! This is just a taste of what’s to come, so please feel free to hack on
<a href="https://github.com/paultag/dockerfdw">Docker FDW</a>,
tweet me <a href="https://twitter.com/paultag">@paultag</a>, or file bugs / feature requests.
It’s currently a bit of a hack, but it’s something that I think has
long-term potential once some work goes into making sure that this is a rock
solid interface to the Docker API.</p> Linode pv-grub chaining https://notes.pault.ag/linode-pv-grub-chainning/Sat, 14 Jun 2014 21:40:00 -0500 https://notes.pault.ag/linode-pv-grub-chainning/ <p>I’ve been using <a href="https://linode.com">Linode</a> since 2010, and many of
my friends have heard me talk about how big a fan I am. I’ve
used Debian unstable on all my Linodes, since I often use them as remote
shells for general-purpose Debian development; I’ve found them
indispensable.</p>
<h1 id="the-problem">The Problem</h1>
<p>Recently, because of my work on <a href="https://docker.io/">Docker</a>, I was forced
to stop using the Linode kernel in favor of the stock Debian kernel, since
the stock Linode kernel has no aufs support, and the default LVM-based
devicemapper backend can be quite a pain.</p>
<aside class="left">
The btrfs errors are ones I fully expect to be gone soon; I can't wait
to switch back to using it.
</aside>
<p>I tried loading in <a href="https://en.wikipedia.org/wiki/Btrfs">btrfs</a> support, and
using that to host the Docker instance backed with btrfs, but it was throwing
errors as well. Stuck with unstable backends, I wanted to use the
<a href="https://en.wikipedia.org/wiki/Aufs">aufs</a> backend, which, despite problems in
aufs internally, is quite stable with Docker (and in general).</p>
<p>I started to run through the <a href="https://library.linode.com/custom-instances/pv-grub-howto">Linode Library’s guide on PV-Grub</a>,
but that resulted in a cryptic error with xen not understanding the compression
of the kernel. I checked for recent changes to the compression, and lo, the
Debian kernel has been switched to use xz compression in sid. Awesome news,
really. XZ compression is awesome, and I’ve been super impressed with how
universally we’ve adopted it in Debian. Keep it up! However, it appears that
only a newer pv-grub than the one installed on the Linode hosts will fix this.</p>
<p>After contacting the (ever friendly) Linode support, they were unable to give
me a timeline on adding xz support, which would entail upgrading pv-grub. It
was quite disappointing news, to be honest. Workarounds were suggested,
but I’m not quite happy with them as proper solutions.</p>
<p>After asking in <code>#debian-kernel</code>, <a href="https://bblank.thinkmo.de/blog">waldi</a> was
able to give me a few pointers, and the following is very much inspired by him;
the only changes of note were config tweaks, which were easy enough.
Thanks, Bastian!</p>
<h1 id="the-constraints">The Constraints</h1>
<p>I wanted to maintain a 100% stock configuration from the kernel up.
When I upgraded my kernel, I wanted it to just work. I didn’t want to
unpack and repack the kernel, and I didn’t want to install software
outside main on my system. It had to be 100% Debian and unmodified.</p>
<h1 id="the-solution">The Solution</h1>
<aside class="right">
It's pretty fun to attach to the lish console and watch bootup pass
through GRUB 0.9, to GRUB 2.x, to Linux. Free Software, Fuck Yeah.
</aside>
<p>Left unable to run my own kernel directly from the Linode interface, the tack
here was to use Linode’s old pv-grub to chain-load grub-xen, which then loads
a modern kernel. Turns out this works great.</p>
<p>Let’s start by creating a config for Linode’s pv-grub to read
and use.</p>
<pre>
sudo mkdir -p /boot/grub/
</pre>
<p>Now, since pv-grub is legacy grub, we can write out the following
config to chain-load <code>grub-xen</code> (which is just Grub 2.0, as far as I can
tell) to <code>/boot/grub/menu.lst</code>. And to think, I almost forgot all about
<code>menu.lst</code>. Almost.</p>
<pre>
default 1
timeout 3
title grub-xen shim
root (hd0)
kernel /boot/xen-shim
boot
</pre>
<p>Just like riding a bike! Now, let’s install and set up grub-xen to work for us.</p>
<pre>
sudo apt-get install grub-xen
sudo update-grub
</pre>
<p>And, let’s set the config for the GRUB image we’ll create in the next step
in the <code>/boot/load.cf</code> file:</p>
<pre>
configfile (xen/xvda)/boot/grub/grub.cfg
</pre>
<p>Now, lastly, let’s generate the <code>/boot/xen-shim</code> file that we need
to boot to:</p>
<pre>
grub-mkimage --prefix '(xen/xvda)/boot/grub' -c /boot/load.cf -O x86_64-xen /usr/lib/grub/x86_64-xen/*.mod > /boot/xen-shim
</pre>
<p>Next, change your boot configuration to use <code>pv-grub</code>, and give the machine
a kick. Should work great! If you run into issues, use the lish shell to
debug it, and let me know what else I should include in this post!</p>
<p>Hack on!</p> Hy at PyCon 2014 https://notes.pault.ag/hy-pycon-2014/Fri, 18 Apr 2014 20:13:00 -0500 https://notes.pault.ag/hy-pycon-2014/ <p>I gave a talk this year at <a href="https://us.pycon.org/2014/">PyCon 2014</a>, about one
of my favorite subjects: <a href="https://hylang.org/">Hy</a>. Many of my regular readers
will have no doubt explored Hy’s thriving
<a href="https://github.com/hylang">GitHub org</a>, played with
<a href="https://try-hy.appspot.com/">try-hy</a>, or even installed it locally by
<a href="https://pypi.python.org/pypi/hy">pip installing it</a>. I was lucky enough to
be able to attend PyCon on behalf of <a href="https://sunlightfoundation.com/">Sunlight</a>,
with a solid contingent of my colleagues. We put together a writeup on the
<a href="https://sunlightfoundation.com/blog/2014/04/18/sunlight-at-pycon-2014/">Sunlight blog</a>
if anyone is interested in our favorite talks.</p>
<div class="video" style="width: 500px; height: 340px; margin: 0 auto;">
<iframe width="500" height="315" src="//www.youtube.com/embed/AmMaN1AokTI" frameborder="0" allowfullscreen></iframe>
</div>
<p>Tons of really amazing questions, and such an amazingly warm reception from
so many of my peers throughout this year’s PyCon. Thank you so much to
everyone that attended the talk. As always, you should
<a href="https://github.com/hylang/hy">Fork Hy on GitHub</a>,
follow <a href="https://twitter.com/hylang">@hylang</a> on the twitters, and
send in any bugs you find!</p>
<p>Hopefully I’ll be able to put my talk up in blog-post form soon, but until then
feel free to look over the <a href="https://slides.pault.ag/hy.html">slides</a> or just
<a href="https://www.youtube.com/watch?v=AmMaN1AokTI">watch the talk</a>.</p>
<p>An extra shout-out to <a href="https://twitter.com/akaptur">@akaptur</a> for hacking on
Hy during the sprints, and giving the exception system
<a href="https://github.com/hylang/hy/pull/556">quite the workthrough</a>.
Thanks, Allison!</p> Musings about Debian and Python https://notes.pault.ag/debian-python/Sat, 21 Sep 2013 22:49:00 -0500 https://notes.pault.ag/debian-python/ <p>On a regular basis, I find myself the odd-man-out when it comes to talking
about how to work with Python on Debian systems. I’m going to write this and
post it so that I might be able to point people at my thoughts without having
to write the same email in response to each thread that pops up.</p>
<p>Turns out I don’t fit in with the Debian hardliners (which is to say, the
mindset that <code>pip</code> sucks and shouldn’t exist), nor do I fit in with the Python
hardliners (which is to say <code>apt</code> and <code>dpkg</code> are out of date, and neither have
a place on a Development machine).</p>
<p>I think our discourse on this topic has become <em>petty</em> and <em>stupid</em> in general.
Let’s all try to step back and drop a bit of the attitude.</p>
<h1 id="pip-doesnt-suck-and-neither-does-apt"><code>pip</code> doesn’t suck, and neither does <code>apt</code>.</h1>
<p>The truth is, both sides are wrong. As with any subject, the real
answer here is much more nuanced than either side presents it. I’m going to
try and present my opinion on this, in the way that both my Pythonista self
and my Debianite self see the issue. Hopefully I can keep this short, to
the point, and caked with logic.</p>
<h2 id="the-case-for-dpkg-the-debianite-in-me">The case for <code>dpkg</code> (the Debianite in me)</h2>
<p>In defense of <code>dpkg</code> and <code>apt</code>, imagine having to install <code>python-gnome2</code>
on all your systems every time you install one. It’d be hell on earth.
Imagine having a <strong>user</strong> try to do this. It’s insane to assume that
end-users will be using <code>pip</code> for this purpose.</p>
<p><code>pip</code> is fun and all, but it’s also installing 100% untrusted code to your
system (perhaps as root, if you’re using <code>pip</code> with <code>sudo</code> for some reason),
and hasn’t been reviewed for software freeness, which is something Debian
(and Debian users) take seriously. This isn’t even to mention the hell that
<code>pip</code> wreaks on <code>dpkg</code> controlled files / packages.</p>
<aside class="left">
Remember, Debian spends a lot of time and effort into ensuring software
is <a href="https://www.debian.org/social_contract#guidelines" >DFSG</a>
free, and safe.
</aside>
<p>Try to remember how much of your system (yes, right now) is running
because of Python or Python modules. Try to imagine how much of a pain in
the ass it’d be if you couldn’t boot into <code>GNOME</code> to use <code>nm-applet</code> to connect
to wifi to <code>pip</code> install something. I’m sure even the most extreme pip’er
understands the need for Operating System level package management.</p>
<p>Debian also has a bigger problem scope - we’re not maintaining a library
in Debian for kicks, we’re maintaining it so that <em>end user applications</em> may
use the library. When we update something like <code>Django</code>, we have to make sure
that we don’t break anything using it (although, to be honest, the fact that we
package webapps is an entire rant for later) before we get to update it to the
newest release.</p>
<p>Hell, with a few coffees, I could automate the process of releasing a <code>.deb</code>
with a new upstream release, 100% unattended. I won’t, however, since this is
an insane idea. Let’s go over a brief list of things I do before uploading a
new package:</p>
<ol>
<li>Review the <em>entire</em> codebase for simple mistakes.</li>
<li>Review the <em>entire</em> codebase for license issues.</li>
<li>Review the <em>entire</em> codebase for files without source, and track down
(and include source for) any sourceless files (such as <code>pickle</code>
files, etc).</li>
<li>Get to know the upstream, get to know open bugs, write something using
the lib, in case I need to debug later.</li>
<li>Install the package.</li>
<li>Test the package.</li>
<li>Work out any Debian package issues (this is easy).</li>
</ol>
<p>Now, a brief list of things I do before I update a package:</p>
<aside class="right">
Some non-Debian people may call this anal. I disagree, since this is
important to ensure we have <i>source</i> for all files. In addition,
it's trivial to take the next step and ensure things are <i>roughly</i>
safe.
</aside>
<ol>
<li>Review the changes between the last uploaded version (in diff format, if
it’s sane, otherwise get the VCS and review), ensure all the above are still
OK.</li>
<li>Review for Debian-local issues (such as how it will upgrade, using
<code>piuparts</code>, and <code>adequate</code>, etc).</li>
<li>Check to make sure it won’t break any reverse dependencies.</li>
<li>Review for bugfixes that I might need to bring back to the <code>stable</code> release.</li>
<li>Figure out if I should (or even can) backport the package, if API is
stable.</li>
<li>Review for bugs (upstream or in Debian) that I need to mention in the
debian/changelog.</li>
</ol>
<p>Clearly, this isn’t a quick-and-dirty task. It’s not a matter of getting a
package updated (technically), it’s a much more detailed process than that.
This is also why Debian is so highly regarded for its technical virtuosity,
and why the
<a href="https://training.linuxfoundation.org/why-our-linux-training/training-reviews/linux-foundation-training-prepares-the-international-space-station-for-linux-migration">ISS decided to deploy Debian in space</a>,
over commercial distros such as <code>Red Hat</code> or <code>Ubuntu</code>, and
community distros such as <code>Fedora</code> or <code>Arch</code>.</p>
<aside class="left">
Cheap shot, I know.
</aside>
<p>It’s also not Debian’s job to package the world in the archive. This is an
insane task, and it’s not Debian’s place to do it. We introduce libraries
as things need them, not because we wrote some new library that someone
may find slightly useful at some point in the future. Maybe.</p>
<p>Upstream developers and language communities (not only Python here) tend to
lose sight of why we’re doing this in the first place, which
is our users. This isn’t some sort of technical pissing contest to see who can
distribute the software in the best way. Debian-folk always keep end users
as our highest priority.</p>
<aside class="right">
I'm sorry to any
<a href="https://lists.debian.org/20100106100055.GV3438@radis.liafa.jussieu.fr" >kittens that may have been harmed by this statement</a>.
</aside>
<p>I quote the
<a href="https://www.debian.org/social_contract">Debian Social Contract</a>, when I say
that <em>Our priorities are our users and free software</em>. No one’s trying to
get <em>developers</em> to use <code>dpkg</code> to create software. In fact, as you’ll see
below, I actively <em>discourage</em> using system modules for development.</p>
<h2 id="the-case-for-pip-the-pythonista-in-me">The case for <code>pip</code> (the Pythonista in me)</h2>
<p>In defense of <code>pip</code>, the idea that Debian will keep the latest versions of
packages is insane. The idea that we can keep pace with upstream releases is
nuts, and the idea that every upstream release on <code>pypi</code> is ready to ship is
bananas. <a href="https://youtu.be/gZHjRQjbHrE?t=2m30s">b-a-n-a-n-a-s</a>.
As a developer, I don’t want to support every release, and I surely don’t want
other people depending on some random snapshot.</p>
<aside class="right">
In fact, I have a very hard time saying anything but <i>"try upgrading
first"</i> when I get a bug report on a side-project.
It's tough to remember some edge-case from 2 years ago if this code is
tightly coupled with another codebase.
</aside>
<p>Oftentimes, I’ll put stuff up on <code>pypi</code> as a preview, or to release often, and
solicit feedback without having to give out instructions on using a <code>git</code>
checkout (it’s also easier to have them try a version from <code>pypi</code> so I can
cross-reference the git tag to reproduce issues when they file them).</p>
<aside class="left">
Even Debian tools I write, like
<a href="https://pypi.python.org/pypi/schroot">python-schroot</a>
are released to <code>pypi</code> first, and I treat that as the
upstream location when packaging it in Debian.
</aside>
<p><code>pypi</code> is easy, ubiquitous and works regardless of the platform, which means
less of my development time is spent packaging stuff up for platforms I don’t
really care about (<code>Arch</code>, <code>Fedora</code>, <code>OSX</code>, <code>Windows</code>), even though I value
feedback from users on those systems. The effort it takes to release something
is limited to <code>python setup.py sdist upload</code>, and it’s in a place (and in a
shape) that anyone can use it without having 10 sets of platform-local
instructions.</p>
<p>Even ignoring all the above, when <em>I’m</em> writing a new app or bit of code,
I want to be sure I’m targeting the latest version of the code I depend on,
so that future changes to API won’t hit me as hard. By following
along with my dependencies’ development, I can be sure that my code breaks
early, and breaks in development, not production. Upstreams also tend to not
like bug reports against old branches, so ensuring I have the latest code from
<code>pypi</code> means I can properly file bugs.</p>
<p>Lastly, I prefer <code>virtualenv</code> based setups for development, since I’m usually
working on many things at once. This often means version mismatches in
libraries, which brings in API changes (another whole rant here as well).
I <em>don’t</em> want to keep installing and uninstalling packages to switch between
projects, and using a <code>chroot(8)</code> means a lot of overhead while being
disconnected from my development environment / filesystem, so I resort to
<code>virtualenv</code> to isolate my development environments.</p>
<h1 id="final-notes">Final notes</h1>
<aside class="right">
I love apt, I love pip, why can't you?
</aside>
<p>I don’t want to keep arguing about this. Just accept that the world’s a big
place and that there exist use-cases that both <code>apt</code> and <code>pip</code> need to exist
and work in the way they’re working now. At the very least, try and understand
there exist smart people on both sides, and no one is trying to screw anyone
over or keep their own little private club to themselves. Hopefully, going
forward, we can make sure that the integration between these two tools gets
<em>better</em>, not worse.</p>
<p>Help make this dream a reality. Contribute to a productive tone, not a
destructive one. In short:</p>
<ul>
<li>Use <code>pip</code> without <code>sudo</code> always. Don’t tell people to use <code>sudo</code>.</li>
<li>Use <code>apt</code> or <code>dpkg</code> when deploying system-wide.</li>
<li>Understand people are going to package, and they will be more concerned
about the software using your library than keeping your library up to date.</li>
<li>Understand Debian Developers and package maintainers have to do a lot of
work when updating or sponsoring an upload.</li>
<li>Understand upstream developers can’t be bothered to fix every issue
with every release (release early, release often), let alone with some snapshot
you introduced into unstable.</li>
<li>Use <code>pip</code> and <code>virtualenv</code> in development setups, so we can upgrade your
app when we upgrade the lib.</li>
</ul> Hy: The survival guide https://notes.pault.ag/hy-survival-guide/Fri, 02 Aug 2013 23:19:00 -0500 https://notes.pault.ag/hy-survival-guide/ <p>One of my new favorite languages is a peppy little
<a href="https://en.wikipedia.org/wiki/Lisp">lisp</a> called
<a href="https://hylang.org">hy</a>. I like it a lot since it’s a result of a hilarious
idea I had while talking with some coworkers over Mexican food. Since I’m
the most experienced <a href="https://github.com/hylang?tab=members">Hypster</a> on the
planet, I figured I should write a survival guide. This will go a lot easier
if you already know Lisp, but you can get away with quite a bit of Python.</p>
<h1 id="the-tao-of-hy">The Tao of Hy</h1>
<p>We don’t have many rules (yet), but we do have quite a bit of philosophy.
The collective Hyve Mind has spent quite a bit of time working out Hy’s
internals, and we do spend a bit of time looking at how the language “feels”.
The following is a brief list of some of the design decisions we’ve
picked out.</p>
<ol>
<li>Look like a lisp, <code>DTRT</code> with it (e.g. dashes turn to underscores,
earmuffs turn to all-caps.)</li>
<li>We’re still Python. Most of the internals translate 1:1 to Python
internals.</li>
<li>Use unicode <em>everywhere</em>.</li>
<li>Tests or it doesn’t exist.</li>
<li>Fix the bad decisions in Python 2 when we can (see <code>true_division</code>)</li>
<li>When in doubt, defer to Python.</li>
<li>If you’re still unsure, defer to Clojure</li>
<li>If you’re even more unsure, defer to Common Lisp</li>
<li>Keep in mind we’re <em>not</em> Clojure. We’re <em>not</em> Common Lisp. We’re Homoiconic
Python, with extra bits that make sense.</li>
</ol>
<p>Naturally, this doesn’t cover everything, but if you can drop into that mindset,
things start to make quite a bit of sense.</p>
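<p>To make the first rule concrete, here’s a toy sketch of those two name transformations, in the Python we compile down to. To be clear: this is an illustration of the stated rules only, not Hy’s actual mangling code.</p>

```python
# Toy illustration of the "DTRT" naming rules above: dashes become
# underscores, and earmuffs (*name*) become ALL-CAPS. This is NOT the
# real Hy mangler, just the two rules from the list, spelled out.
def mangle(name):
    if len(name) > 2 and name.startswith("*") and name.endswith("*"):
        name = name[1:-1].upper()  # earmuffs -> ALL-CAPS
    return name.replace("-", "_")  # dashes -> underscores

print(mangle("my-fn"))     # my_fn
print(mangle("*my-var*"))  # MY_VAR
```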
<h1 id="the-style-of-hy">The Style of Hy</h1>
<p>Although I am perhaps the least qualified person to do so (I still don’t write
idiomatic Lisp all the time), I’m going to set up a few ground-rules when it
comes to idiomatic Hy code. We borrow quite a bit of syntax from Common Lisp
and Clojure, so again, feel free to defer to either if you’re not working
on Hy internals. I prefer the
<a href="https://github.com/bbatsov/clojure-style-guide">Clojure Style Guidelines</a>
myself. As such, these are what we will defer to in the case that the Hy
style is undefined.</p>
<h2 id="clojure-isms">Clojure-isms</h2>
<p>Hy has quite a few Clojure-isms that I rather prefer, such as the threading
macro, and dot-notation (for accessing methods on an Object), which I would
rather see used throughout the hylands.</p>
<pre><code>:::clojure
;; good:
(with [fd (open "/etc/passwd")]
(print (.readlines fd)))
;; bad:
(with [fd (open "/etc/passwd")]
(print (fd.readlines)))
</code></pre>
<p>Some <a href="https://dustycloud.org/">other Hy devs</a> very much disagree; there’s
nothing syntactically invalid about the latter, and it will continue to be
supported (in fact, it makes some things easier!), but it will not be
used in Hy internal code.</p>
<p>We also very much encourage use of the <code>threading macro</code> throughout code
where it makes sense.</p>
<pre><code>:::clojure
;; good:
(import [sh [cat grep]])
(-> (cat "/usr/share/dict/words") (grep "-E" "tag$"))
;; bad:
(import [sh [cat grep]])
(grep (cat "/usr/share/dict/words") "-E" "tag$")
</code></pre>
<p>However, only use it when it aids clarity; like all things, there are
cases where it makes a mess out of something that ought not be futzed with.</p>
<h2 id="python-isms">Python-isms</h2>
<p>In addition to stealing quite a bit of syntax from Clojure, I’m going to
take a few Python rules from PEP8 that apply to Hy as well. These are taken
because PEP8 is a really great set of rules, and Hy code ends up pretty,
well, Pythonic. The following are a collection of Pythonic rules that
explicitly apply to Hy code.</p>
<p>Trailing whitespace is a huge one. Never ever ever shall it be OK to have
trailing whitespace in internal Hy code. For it sucks.</p>
<p>As with Python, module-level definitions shall always be separated by two
blank lines.</p>
<p>All public functions must always contain docstrings.</p>
<p>Inline comments shall be <em>two</em> spaces from the end of the code. They
must always have a space between the comment character and the start of the
comment.</p>
<h2 id="hy-isms">Hy-isms</h2>
<p>Indentation shall be two spaces, except where matching the indentation
of the previous line.</p>
<pre>
;; good (and preferred):
(defn fib [n]
(if (<= n 2)
n
(+ (fib (- n 1)) (fib (- n 2)))))
;; still OK:
(defn fib [n]
(if (<= n 2) n (+ (fib (- n 1)) (fib (- n 2)))))
;; still OK:
(defn fib [n]
(if (<= n 2)
n
(+ (fib (- n 1)) (fib (- n 2)))))
;; Stupid as hell
(defn fib [n]
(if (<= n 2)
n
(+ (fib (- n 1)) (fib (- n 2)))))
</pre>
<p>Parens must <em>never</em> be alone, sad, all by their lonesome on their own line.</p>
<pre>
;; good (and preferred):
(defn fib [n]
(if (<= n 2)
n
(+ (fib (- n 1)) (fib (- n 2)))))
;; Stupid as hell
(defn fib [n]
(if (<= n 2)
n
(+ (fib (- n 1)) (fib (- n 2)))
)
) ; GAH, BURN IT WITH FIRE
</pre>
<p>Don’t use S-Expression syntax where vector syntax is really required. For
instance, the fact that:</p>
<pre>
;; bad (and evil)
(defn foo (x) (print x))
(foo 1)
</pre>
<p>works is just because the compiler isn’t overly strict. In reality, the
correct syntax in places such as this is:</p>
<pre>
;; good (and preferred):
(defn foo [x] (print x))
(foo 1)
</pre>
<h1 id="notice">Notice</h1>
<p>This guide is, above all, a <em>guide</em>. This is also only truly binding
for working on Hy code internally. This post is also super subject to change
in the future, whenever I can be bothered to ensure that we have more of the
de facto rules written down.</p> Automatically lint your packages with debuild.me https://notes.pault.ag/debuild-me/Sun, 09 Jun 2013 17:43:00 -0500 https://notes.pault.ag/debuild-me/ <p>Over my time working with Debian packages, I’ve always been concerned that
I have been missing catchable mistakes by not running all the static checking
tools I could run. As a result, I’ve been interested in writing some code that
automates this process, a place where I can push a package and come back a few
hours later to check on the results. This is great, since it provides a slightly
less scary interface for new packagers, and helps them avoid feeling they’ve
just been “told off” by a Developer.</p>
<p>I’ve spent the time to actually write this code, and I’ve called it
<a href="https://debuild.me">debuild.me</a>. The code itself is in its fourth
iteration, and is built up from a few core components. The client / server code
(<a href="https://github.com/paultag/lucy">lucy</a> and
<a href="https://github.com/paultag/ethel">ethel</a>) are quite interconnected, but
<a href="https://github.com/fedora-static-analysis/firehose">firehose</a> works great
on its own, and is a single, unified (and sane!) spec that is easy
to hack with (or even on!). Hopefully, this means that our wrappers will be
usable outside of debuild.me, which is a win for everyone.</p>
<h1 id="backend-design">Backend Design</h1>
<p>The backend (<a href="https://github.com/paultag/lucy">lucy</a>) was the first part
I wanted to design. I made the decision (very early on) that everything was
going to be 100% Python 3.3+. This lets me use some of the (frankly, sweet)
tools in the stdlib. Since I’ve written this type of thing before
(I’ve tried to write this tool <a href="https://github.com/paultag/monomoy-old">many</a>,
<a href="https://github.com/paultag/monomoy">many</a>,
<a href="https://github.com/paultag/chatham-old">many</a>,
<a href="https://github.com/paultag/chatham">many</a> times before), I had a rough
sense of how I wanted to design the backend. Past iterations had suffered from
an overly complex server half, so I decided to go ultra minimal with the
design of debuild.me.</p>
<aside class='left'>
You can find the code for the server (lucy) on
<a href="https://github.com/paultag/lucy">my GitHub</a>
</aside>
<p>The backend watches a directory (using a simple <code>inotify</code> script) and processes
<code>.changes</code> files as they come in. If the package is a source package, a set of
jobs are triggered (such as <code>lintian</code>, <code>build</code> and <code>desktop-file-validate</code>),
as well as a different set for binary packages (such as <code>lintian</code>, <code>piuparts</code>
and <code>adequate</code>). Only people may upload source packages (without any debs) and
only builders can upload binary packages (without source).</p>
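<p>That routing boils down to something like the following sketch (an illustration of the rules above, not lucy’s actual code):</p>

```python
# Hedged sketch of lucy's .changes routing as described above (illustration
# only): source uploads get linting + build jobs, binary uploads get the
# install-time checkers.
SOURCE_JOBS = ["lintian", "build", "desktop-file-validate"]
BINARY_JOBS = ["lintian", "piuparts", "adequate"]


def jobs_for_upload(file_names):
    """Pick a job set from the file list in a .changes upload."""
    # A source-only upload carries a .dsc and no .debs; a builder
    # upload carries .debs and no source.
    has_source = any(name.endswith(".dsc") for name in file_names)
    return SOURCE_JOBS if has_source else BINARY_JOBS


print(jobs_for_upload(["foo_1.0-1.dsc", "foo_1.0.orig.tar.gz"]))
# ['lintian', 'build', 'desktop-file-validate']
```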
<p>The client and server talk using
<a href="https://docs.python.org/3/library/xmlrpc.server.html">XML-RPC</a> with BASIC HTTP
auth. I’m going to (eventually) SSL secure the transport layer, but for now,
this will work as a proof of concept.</p>
<p>Since I tend to like to keep my codebase simple and straightforward, I’ve used
<a href="https://www.mongodb.org/">MongoDB</a> as Lucy’s DB. This lets me move between
documents in Mongo and Python objects without any trouble. In addition, I
evaluated some of the queue code out there (ZMQ, etc), and they all seemed
like overkill for my problem, and had a hard time keeping track of jobs that
(must never!) get lost. As a result, I wrote my own (very simple) job queue
in Mongo, which has no sense of scheduling (at all), but can do its job (and
do it well).</p>
<p>Jobs describe what’s to be built with a link to the <code>package</code> document
that the job relates to, and its <code>arch</code> and <code>suite</code> (don’t worry about the
rest just yet). Jobs get assigned via natural sort on their <code>UUID</code>-based <code>_id</code>,
and assigned to the first builder that can process its <code>arch</code> / <code>suite</code>.
Source packages are considered <code>arch:all</code> / <code>suite:unstable</code> (so they always
get the most up-to-date linters on any arch that comes along).</p>
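An in-memory sketch of that assignment rule (the real thing is a Mongo query; the field names here are illustrative, not lucy's actual schema):

```python
import uuid

def assign_next_job(jobs, builder):
    # Walk jobs by natural sort on their UUID-based _id, and hand the
    # builder the first unassigned one matching its arches/suites.
    for job in sorted(jobs, key=lambda j: j["_id"]):
        if (job["assigned_to"] is None
                and job["arch"] in builder["arches"]
                and job["suite"] in builder["suites"]):
            job["assigned_to"] = builder["name"]
            return job
    return None

jobs = [
    {"_id": str(uuid.uuid4()), "arch": "amd64", "suite": "unstable",
     "assigned_to": None},
    {"_id": str(uuid.uuid4()), "arch": "armhf", "suite": "unstable",
     "assigned_to": None},
]
builder = {"name": "ethel-1", "arches": ["amd64", "all"],
           "suites": ["unstable"]}
job = assign_next_job(jobs, builder)
```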
<p>Lucy also allows for uploads to be given an <code>X-Lucy-Group</code> tag to manage which
set of packages they’re a part of. This comes in handy for doing partial
archive rebuilds, or eventually using it to manage what jobs should be run
on which uploads. This will allow me to run much more time-consuming tools
for packages I want to review, versus rebuilding just to ensure packages don’t
FTBFS and pass <code>adequate</code>.</p>
<h1 id="client-design">Client Design</h1>
<p>The buildd client (<a href="https://github.com/paultag/ethel">ethel</a>) talks with <code>lucy</code>
via <code>XML-RPC</code> to get assigned new jobs, release old jobs, close finished jobs,
and upload package report data. When the <code>etheld</code> requests a new job, it also
passes along what <code>suites</code> it knows of, which <code>arches</code> it can build, as well
as what <code>types</code> it can run (stuff like <code>lintian</code>, <code>build</code> or <code>cppcheck</code>). Lucy
then assigns the builder to that job (so that we don’t allocate the same job
twice), and what time it was assigned at.</p>
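Python’s stdlib has both halves of this conversation. Here is a toy version of the exchange (the method name, job shape, and lack of auth are all made up for illustration, not lucy’s actual API):

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def get_job(suites, arches, types):
    # Toy lucy endpoint: the builder says what it can do, and the
    # server hands back a matching job (or an empty struct).
    job = {"package": "hello", "suite": "unstable",
           "arch": "amd64", "type": "build"}
    if (job["suite"] in suites and job["arch"] in arches
            and job["type"] in types):
        return job
    return {}

server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(get_job)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# The builder side: advertise capabilities, receive a job.
proxy = ServerProxy("http://127.0.0.1:%d/" % port)
job = proxy.get_job(["unstable"], ["amd64"], ["build", "lintian"])
server.shutdown()
```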
<aside class='right'>
You can find the code for the client (ethel) on
<a href="https://github.com/paultag/ethel">my GitHub</a>
</aside>
<p>Ethel then takes the result of the job (in the form of a <code>firehose.model</code> tree)
and transmits it over the line back to the Lucy server as a <code>report</code> (which also
contains information on if the build failed or not), at which point
lucy hands back a location (on the lucy host) that the daemon can write the log
to.</p>
<p>If the job was a binary build, the <code>etheld</code> process will <code>dput</code> the package to
the server, with a special <code>X-Lucy-Job</code> tag to signal which job that build
relates to, so that future lint runs can fetch the <code>deb</code> files that the build
produced.</p>
<h1 id="tooling">Tooling</h1>
<p>Ethel runs a set of static checkers on the source code, which are basically
fancy wrappers around the tools we all know and love (like
<a href="https://lintian.debian.org/">lintian</a>,
<a href="https://freedesktop.org/wiki/Software/desktop-file-utils/">desktop-file-validate</a>,
or <a href="https://piuparts.debian.org/">piuparts</a>) which output Firehose in place of
home-grown stdout. This allows us to programmatically deal with the output
of these tools in a normal and consistent way.</p>
<aside class='left'>
You can read more about Firehose over in the Firehose
<a href="https://github.com/fedora-static-analysis/firehose/blob/master/README.rst">README.rst</a>
</aside>
<p>Some of the more complex runners are made of 3 parts - a <code>runner</code>, <code>wrapper</code>
and <code>command</code>. The server invokes the <code>command</code> routine, which invokes the
<code>runner</code> (the command just provides a unified interface to all the runners),
whose output gets parsed by the <code>wrapper</code> to turn it into a Firehose model
tree.</p>
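A minimal shape of that three-part split, with <code>echo</code> standing in for the wrapped tool and plain dicts standing in for a <code>firehose.model</code> tree (all names here are illustrative):

```python
import subprocess

def runner(path):
    # Invoke the underlying tool; `echo` stands in for lintian etc.,
    # emitting one "severity: path: tag" line.
    return subprocess.run(["echo", "W: %s: some-tag" % path],
                          capture_output=True, text=True).stdout

def wrapper(output):
    # Parse the tool's stdout into structured issues -- the real
    # wrappers build a firehose.model tree instead of dicts.
    issues = []
    for line in output.splitlines():
        severity, path, tag = line.split(": ", 2)
        issues.append({"severity": severity, "path": path, "tag": tag})
    return issues

def command(path):
    # The unified entry point: run the tool, parse its output.
    return wrapper(runner(path))
```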
<p>The goal here is that tons of very quick-running tools get run over a
distributed network, and machine-readable reports get filed in a central
location to aid in reviewing packages.</p>
<h1 id="ricky">Ricky</h1>
<p>In addition to the actual code to run builds, I’ve worked on a few tools to
aid with using debuild.me for my DD-related life. I have some uncommon
use-cases that are nice to support. One such use-case is the ability to rebuild
packages from the archive (unmodified) to check that they rebuild OK against
the target. This is handy for things like <code>arch:all</code> packages that get
uploaded (since they never get rebuilt on the buildd machines, and FTBFSs are
sadly common) or packages that have had a <code>Build-Dependency</code> change on them.</p>
<p>Ricky is able to create a <code>.dsc</code> url to your friendly local mirror, and fetch
that exact version of the package. Ricky can then also use the <code>.dsc</code> (in a
monumental hack) to forge a <code>package_version_source.changes</code> file, and sign
it with an autobuild key and upload it to the debuild.me instance. Since it
can also modify the <code>.changes</code>’s target distribution, you can also use this to
test if a package will build on <code>stable</code> or <code>testing</code>, unmodified.</p>
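Constructing that <code>.dsc</code> URL follows the standard Debian pool layout. A sketch (not ricky’s actual code) that handles the <code>lib*</code> prefix directories and strips any epoch from the filename:

```python
def dsc_url(mirror, component, source, version):
    # Pool layout: lib* sources get a four-character prefix
    # directory; everything else uses the first letter. Epochs
    # ("1:") appear in the version but never in the filename.
    prefix = source[:4] if source.startswith("lib") else source[0]
    upstream = version.split(":", 1)[-1]
    return "%s/pool/%s/%s/%s/%s_%s.dsc" % (
        mirror.rstrip("/"), component, prefix, source, source, upstream)
```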
<h1 id="fred">Fred</h1>
<p>Fred is a wrapper around Ricky that helps with fetching packages that may not
exist yet. Fred contains an email scraper that reads lists such as
<a href="https://lists.debian.org/debian-devel-changes">debian-devel-changes</a>,
adds an entry to fetch each upload when it becomes available on the local
mirror, and passes it to <code>ricky</code>, allowing debuild.me to rebuild new
packages that match a set of criteria.</p>
<p>I’m currently playing around with the idea of rebuilding all incoming
Python packages to ensure they don’t FTBFS in a clean chroot.</p>
<h1 id="loofah">Loofah</h1>
<p>Loofah is another wrapper around Ricky, but for manual use. Loofah
is able to sync down the apt <code>Sources</code> list, and place it in Mongo for fast
queries. This then allows me to manually run rebuilds on any Source package
that fits a set of criteria (written in the form of a Mongo query), which get
pulled and uploaded by <code>Ricky</code>.</p>
<p>An example script to rebuild any packages that <code>Build-Depend</code> on
<code>python3-all-dev</code> in Debian <code>unstable</code> / <code>main</code> would look like:</p>
<aside class='right'>
You can find more queries in the Loofah
<a href = 'https://github.com/paultag/loofah/tree/master/eg' >examples</a>
</aside>
<pre>
[
{ "version": "unstable", "suite": "main" },
{ "Build-Depends": "python3-all-dev" }
]
</pre>
<p>Or, a script to rebuild any package that depends on CDBS:</p>
<pre>
[
{},
{"$or": [{"Build-Depends": "cdbs"},
{"Build-Depends-Indep": "cdbs"}]}
]
</pre>
<p>You can use anything that exists in the <code>Sources.gz</code> file to query
off of (including <code>Maintainer</code>!)</p>
<h1 id="future-work">Future Work</h1>
<p>The future work on debuild.me will be centered around making it easier for
buildd nodes to be added to the network, with more and more automation in that
process (likely in the form of debs). I also want to add better control over
the jobs, so that packages I upload only go to my personal servers.</p>
<p>I’d also very much like to get better EC2 / Virtualization support integrated
into the network, so that the buildd count grows with the queue size. This
is a slightly hard problem that I’m keen to fix.</p>
<p>I’m also considering moving the log parsing code <em>out</em> of the workers, so that
the parsing code can be fixed without upgrading all the workers. This would also
drop the <code>Firehose</code> dep on the client code, which would be nice.</p>
<p>Migration from a debuild.me build into a local <code>reprepro</code> repo is something
that would be fairly easy to do as well, likely to be done remotely via
the <code>XML-RPC</code> interface, which calls a couple of <code>reprepro</code> commands (such as
<code>includedsc</code> and <code>includedeb</code>) and publishes it to the user’s repo. This is
a nice use of the debs that get built, and could also allow debuild.me to be
used like a PPA system, while still letting the user <em>not</em> migrate packages
that may contain <code>piuparts</code> issues.</p> A primer on apt's mirror:// protocol https://notes.pault.ag/apt-mirror/ Sat, 23 Feb 2013 21:04:00 -0500 <p>It’s sometimes helpful to keep your machines using a list of apt archives
rather than a single mirror, because redundancy is good. Rather than
using (the great) services like <code>http.debian.net</code> or <code>ftp.us.debian.org</code>,
you can set your own mirror lists using apt’s <code>mirror://</code> protocol.</p>
<aside class="right">
While initially hacking this through, Micah ended up
filing a bug on <code>mirror://</code>, more information in
<a href="https://bugs.debian.org/699310">the bts</a>. I've since been
able to get it to work for me, but beware!
</aside>
<p>All of this is ultra unstable, so be a bit careful when using this. I’ve been
using <code>mirror://</code> for a few months now, and it seems fine (even have my servers
using it), but it was a bit of a pain to set up. It gets slightly confused if
you point it at something bad, and it’s a mild pain to debug. Hopefully
more people will see the value in <code>mirror://</code>, and contribute code to its
development.</p>
<h1 id="why-bother">Why bother?</h1>
<p>If you’re the sort to keep an archive mirror on the LAN, it’s helpful to
have your machines default to that local mirror, and fall back to your
nearest friendly public mirror otherwise. In addition,
this lets you hand-define where apt searches for mirrors, which is great, since
you can control the subset of servers you ping a bit more closely.</p>
<h1 id="practical-bits--quickstart">Practical Bits / quickstart</h1>
<p>The following block covers the quick and dirty details on how to set up
<code>mirror://</code> for use on your machine (today!). This is very basic, and
details are very sparse, but hopefully there’s enough here to help folks
use this on their local system. Basically, you’ve got three core things to do:</p>
<ol>
<li>Pick your mirrors (this one’s a bit of a duh)</li>
<li>Put them in a public place you can always get to, regardless of
where you are in cyberspace (I use
<a href="https://static.pault.ag/debian/mirrors.txt">static.pault.ag</a>) - remember,
this is the one thing all your machines need to always get to, no matter
where they are.</li>
<li>Configure your <code>sources.list</code> to use the mirror.txt file by pointing
to the text file with the <code>mirror://</code> protocol.</li>
</ol>
<p>Turns out <code>mirror://</code>’s protocol handler will segfault if you give it
something bad, so don’t be afraid if you see <code>apt-get update</code> segfault - it
just means you’ve likely not pointed it at a valid text file. The file itself
should be a simple list of mirrors to try, one per line, in
order of priority. Mine looks a bit like:
<pre>
https://127.0.0.1:3142/debian.lcs.mit.edu/debian/
https://debian.lcs.mit.edu/debian/
# https://http.debian.net/debian/
</pre>
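Assuming the format is exactly what that example shows (one URL per line, <code>#</code> comments and blank lines skipped, order is priority), parsing it is trivial; a small checker like this is handy for sanity-checking the file before pointing apt at it. This is my guess at the accepted format, not apt's actual parser:

```python
def parse_mirrors(text):
    # One mirror URL per line, in priority order; blank lines and
    # '#' comment lines are skipped.
    mirrors = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            mirrors.append(line)
    return mirrors
```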
<p>Finally, your <code>sources.list</code> entry should look a bit like:</p>
<pre>
deb mirror://static.pault.ag/debian/mirrors.txt unstable main
deb mirror://static.pault.ag/debian/mirrors.txt experimental main
deb-src mirror://static.pault.ag/debian/mirrors.txt unstable main
deb-src mirror://static.pault.ag/debian/mirrors.txt experimental main
</pre>
<h1 id="problems">Problems</h1>
<p>With the good comes the bad. Not everything fully supports this, and most
tools that parse <code>sources.list</code> break in a really silly way.</p>
<h2 id="command-not-found">command-not-found</h2>
<p><code>update-command-not-found</code> will blow up like:</p>
<pre>
W: Don't know how to handle mirror
W: Don't know how to handle mirror
W: Don't know how to handle mirror
W: Don't know how to handle mirror
W: Don't know how to handle mirror
W: Don't know how to handle mirror
</pre> Using env(1) in the shebang https://notes.pault.ag/env-in-shebang/ Tue, 15 Jan 2013 20:02:00 -0500 <p>Some of you out there may have tried to pass flags to a script that was being
invoked via <code>/usr/bin/env</code> in the shebang (<code>#!</code>), such as <code>python</code>. You might
recall an error such as:</p>
<pre>
/usr/bin/env: python -d: No such file or directory
</pre>
<p>This error is super annoying, so I went about trying to figure out how
I can pass arguments to <code>python</code> (or even things like <code>ipython</code> or <code>bpython</code>).</p>
<p>The idea is we can abuse the concept of a
<a href="https://en.wikipedia.org/wiki/Polyglot_(computing)">polygot</a> to shim in some
things we care about.</p>
<h1 id="implementation">Implementation</h1>
<p>Let’s take a look at a quick script I hacked up to use bpython with a pre-made
script that drops into interactive work.</p>
<pre>
#!/bin/sh
"""":
exec /usr/bin/env bpython -i $0 $@
"""
import hy
print "Hython is now importable!"
</pre>
<p>Let’s step through this slowly. First, the bits the shell (<code>/bin/sh</code>) sees:</p>
<pre>
#!/bin/sh
"""":
exec /usr/bin/env bpython -i $0 $@
</pre>
<p>This causes <code>bpython</code> to reload the file, which looks like the following
to Python:</p>
<pre>
#!/bin/sh
"""":
exec /usr/bin/env bpython -i $0 $@
"""
import hy
print "Hython is now importable!"
</pre>
<p>Where Python can now ignore the docstring. Magic!</p>