Chris McCormick - News - workblog

What I learned bootstrapping side projects in 2020
https://mccormick.cx/news/entries/what-i-learned-bootstrapping-side-projects-in-2020

In 2020 I made $693 USD from side projects. I contributed to several open source projects and made about a thousand commits on GitHub. I shipped a commercial game, a SaaS product, an IDE, two music apps, a handful of open source utilities, and a couple of websites. I love building stuff!

What I learned:

  • Find demand before building something.
  • Single-pay products are easier than SaaS.
  • Build in public to boost the launch.
  • Ongoing updates help people find the thing.
  • Build websites to help people find the thing.
  • Being open source does not hurt sales.

The side project income mostly came from the game and the two music apps.

The SaaS products I worked on failed to generate more than a few cheeseburgers. 🍔 Good thing I love cheeseburgers. If you're doing a first-time project and you want some quick wins then probably don't start with SaaS.

Most games follow a typical pattern: they earn all their money in a big burst at the start and then the income falls off to zero. I've heard the same in talks by other indie game developers - there is a novelty factor in games. This was true for my game too.

I mainly got the word out on Asterogue by building the game in public. What that means is I posted in multiple game dev forums about progress as I was making the game. It was a big effort and the launch went alright as a result (as opposed to most projects which launch to crickets). My fastest sales were from Asterogue but they've fallen to zero now.

The two music apps have done better. I've had 3 months of steady sales from them. They're making a combined revenue of about $80 USD per month.

[Chart: Beat Maker 3-month revenue (AUD)]

[Chart: PO LoopSync 3-month revenue (AUD)]

I tried something new with PO LoopSync: I found demand before building. I studied a forum where enthusiasts of pocket operator devices hang out and took notes. The notes revealed patterns in what they found important. Once I figured out what they wanted I made some mockups and asked them if that was what they wanted. When they said yes, I built it.

This worked well and sales were good right from the start. Building stuff that people already want is more fun for everyone.

I've been building in public on different projects for a while now. It basically means I tweet about what I am making. Also, this very blog! When I launch something, people such as yourself can find out about it easily and tell others. That helps a lot to get the word out. Thank you for that!

I actually don't like building in public very much. It's uncomfortable. I especially don't like social media. So I've set up a system that allows me to interact with social media in an efficient and minimal way, but still be present. Maybe I'll post about that some time.

My problem up until now has been the bang-then-crash of launches: initial interest and sales that then trail off. I've found two good ways to fix that this year. I guess you'd call these activities "marketing".

The first one is to post maintenance updates. Every time I update an existing project I try to post something about it. This probably seems obvious to a lot of people but it wasn't obvious to me. Each time I post an update I notice a small spike in interest and/or sales.

For example there was a gamejam for roguelike games this month. Leading up to the jam I made weekly updates to Roguelike Browser Boilerplate. I posted about them on the roguelike subreddit in its "saturday sharing" section. A bunch of new sales came in! Posting updates works.

The second thing I have tried is building useful websites that link to my apps. Marketing people call this SEO. I call it "building a useful website that links to my app". Not as catchy I guess. My friend Tobias showed me how effective SEO can be. There is also a good article about SEO by jdnoc that I learned from. Some people use SEO in a deceptive way but for me it's about giving my work the best chance to be found by people who are looking for it.

The first site I built is all about pocket operators. I used the research I did earlier to make a page about the most useful stuff. This got me on the front page of search for the phrase "pocket operator apps". Lo and behold one of my apps is a pocket operator app! So people find the page and then they can find my apps too. Here's a graph of the search volume on that site:

[Chart: search volume for the pocket operator site]

Another site I built is a free web app for generating random melodies. It's a procedural melody maker that generates MIDI melodies you can download. There's a link from the free web app to my other apps.

The name for this is apparently "side project marketing". So my side projects are being marketed by my side-side projects. The free web app shows up on the front page of search for a group of terms around "ai midi melody generator". One of the apps it links to is a random beat generator so the audience is very similar. Here's the graph of search volume for the melody generator:

[Chart: search volume for the melody generator]

So I think these two sites are a fairly steady channel where people find my apps. They're searching for "pocket operator apps" and "melody generator" and they find those sites and then they also sometimes click through and buy my apps.

There is one other way people are finding the apps. I have a free app on the Play store. It's an app for building music apps. So people who are already interested in music apps and development can find that, and then some of them also find my other apps.

The final thing I will note is that being open source has not hurt sales. Beat Maker is open source and I haven't open sourced PO LoopSync yet. So it's an imperfect A/B test but it's still useful. There isn't really much difference in the sales. There is basically only upside to being open source for somebody at my scale. People like it, and it builds trust, and it fits my ethics.

Anyway, I hope this is useful to somebody. I'll continue to post updates about my dev adventures and things I have discovered.

Posted: Wed, 24 Mar 2021 23:29 GMT

Un-template Python HTML Library
https://mccormick.cx/news/entries/un-template-python-html-library

Un-template is a minimal Python library to modify and render HTML. The idea is to start with a pure HTML document, choose the bits you want to modify using CSS-style selectors, and then change them using declarative Python datastructures. This means you can use pure Python to replace sections of HTML and do all of the "template logic", similar to how declarative front-end frameworks work.

Install it with pip install ntpl.

Here's an example. It's a Django view function that returns a web page. A file called index.html is loaded, and two HTML tags with the IDs content-title and content are replaced with new content. A link tag with the ID home is replaced completely with a new link tag.

from django.http import HttpResponse
from markdown import markdown
from ntpl import slurp, replace, replaceWith, render

template = slurp("index.html")
faqs = markdown(slurp("FAQ.md"))

def faq(request):
    html = template
    html = replace(html, "#content-title", "FAQ")
    html = replace(html, "#content", faqs)
    html = replaceWith(html, "a#home", render(["a", {"href": "/"}, "home"]))
    return HttpResponse(html)

That's basically all there is to it. You can use it like this instead of Django's built-in templating language.

You can get the source code on GitHub or install it with pip.

It is inspired by Clojure's Enlive and Hiccup, and it uses pyhiccup and Beautiful Soup to do its work.

Posted: Tue, 02 Jun 2020 11:36 GMT

The Zero Customer Heuristic
https://mccormick.cx/news/entries/the-zero-customer-heuristic-

When you're building an MVP or SLC ("simple, lovable, complete") it's tempting to over-think technical choices early on. It's tempting to build in all kinds of features and infrastructure.

"People might want a PDF," you think to yourself, "so I'll also build and deploy a PDF rendering server!"

"I'll need a way to measure everything my ten million customers are doing at scale so I'll deploy this elastically scaling mega analytics doodad framework server and hook it up to every user action in my app!"

[GIF: crickets chirping]

It's much easier to imagine new features than it is to actually build them. Before you know it there is a vapourware castle shimmering on the horizon.

I've found a useful thought pattern to combat this type of over-engineering. For every technical question just think to yourself:

What will my zero customers think of this?

When you are building the first version of something there are literally zero users. Unless it is the core function of the thing you are building, making a PDF rendering server or deploying a giga-scale analytics system is not what you should be doing. Your zero users don't care about those things! Getting even one person to actually use your thing and tell you if it's good or not is what you should be doing. Talking to people and asking them what they need is what you should be doing.

Let's protect the most scarce resource of all: our own time. At the start of the project let's make a bee-line towards shipping the best possible implementation of the core function of the software so we can find out if it's useful and good.

Posted: Wed, 08 Apr 2020 06:27 GMT

A Clojure-like Lisp That Runs In Bash
https://mccormick.cx/news/entries/a-clojure-like-lisp-that-runs-in-bash

Fleck is a Clojure-like LISP that runs wherever Bash is. Get it here.

This is a little experiment I hacked together from the amazing make-a-lisp project. My hard drive is littered with attempts to make this exact thing several times, and this was the first time I got it working properly.

My friend Crispin reminded me of this idea when we were discussing lightweight Clojure-based scripting tools for domain specific tasks. He has made a fantastic Clojure static site generator called bootleg on top of GraalVM. We've both been inspired by the work of Michiel Borkent such as Small Clojure Interpreter and Babashka.

For more Clojure-likes check out awesome-clojure-likes.

Exciting times in the Clojure-verse!

Posted: Sat, 30 Nov 2019 07:25 GMT

How I built an Excel add-in to export HTML tables
https://mccormick.cx/news/entries/how-i-built-an-excel-add-in-to-export-html-tables

An economist told me the worst part of her job is turning Excel data into HTML tables, so I built an add-in to fix it.

Many software developers probably don't realise that Microsoft Office add-ins these days are simply web pages which run inside a panel in the UI. I suppose it was to be expected given the trend of the last couple of decades towards web-based everything, but it still came as a surprise to me when I had to build one for a job. In my mind Office was still back in the world of COM objects and Delphi and DLLs.

After discovering how easy it is, I decided to do it again for this side project. Here is how I did it.

The plugin (whoops, "add-in") I built is called "Excel to HTML table" and does what it says on the tin. You make a selection of cells, click "copy", and you get a clipboard full of the corresponding HTML code that will render those cells. After that you can paste the code into your text editor and use it in a site's HTML.

Excel to HTML table screencast

Microsoft have some Node.js and Yeoman starter projects to help you get up and running but I'm the sort of developer who likes to roll their own and keep things tight, tidy, and tiny. I like to build things from first principles so that I can understand what is going on at as low a level as possible. Here's what I discovered.

The Microsoft Tutorial has a lot of good info in it, but it's geared towards people using either Node + Yeoman or Visual Studio. If you're a text-editor-and-command-line person like me the best way to get started is just to grab the example .html, .css, and .js snippets from the Visual Studio version.

Get a dev server running

Any local HTTPS server will do. The difficult part is that it must be HTTPS, even on localhost, or Word/Excel will refuse to load your add-in. My add-in is almost entirely client-side code, so I could get away with a small Python server using the built-in libraries like this:

import ssl
from http.server import HTTPServer, SimpleHTTPRequestHandler

# serve the current directory over HTTPS on port 8000
address = ("localhost", 8000)
httpd = HTTPServer(address, SimpleHTTPRequestHandler)
httpd.socket = ssl.wrap_socket(httpd.socket, keyfile='selfsigned.key', certfile='selfsigned.cer', server_side=True)
httpd.serve_forever()

As you can see this looks for the .key and .cer files in the local dir, and you can create those with openssl:

openssl req -x509 -newkey rsa:4096 -keyout selfsigned.key -out selfsigned.cer -days 365 -nodes -subj "/C=US/ST=NY/O=localhost/OU=Localhost"

You can do the same thing in nodejs with express if that's your bag by passing the right options to https.createServer:

const fs = require('fs')
const https = require('https')
const app = require('express')()

const server = https.createServer({
    key: fs.readFileSync('./selfsigned.key'),
    cert: fs.readFileSync('./selfsigned.cer')
}, app).listen(8000)

Before you try to load your add-in, make sure your browser has accepted the self-signed cert, or the add-in won't load. Do this by browsing to your localhost:8000 server and bypassing the certificate warnings.

If you're debugging in the native Office app instead of Office Online, you will need to do this in Internet Explorer for it to work as far as I can tell.

Manifest validation


Office finds out about the URL for your add-in using a "manifest" file. This file is XML and pretty fragile. You need to make sure it complies with the spec. Luckily Microsoft have a tool for validating the manifest on the command line. Install the npm package office-toolbox (version 0.2.1 at the time of writing) and then you can run a command like this:

./node_modules/.bin/office-toolbox validate -m ./path/to/manifest.xml

This will report most issues that come up.

In the manifest you can use https://localhost:8000/Home.html and it will point to your local dev server when running.

The Code

The code itself is basically web code. You can use libraries like React and jQuery in your UI. The exception of course is when calling the native APIs. These are exposed through an interface like Excel.run(...) and make heavy use of promises for async. You will often find yourself doing context.load() and then waiting for the promise to resolve before doing the next thing in the document. The API documentation is super useful for figuring out what is possible and how to do it.
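For example, reading the values of the currently selected cells looks something like this - a minimal sketch of the load/sync pattern using the documented Office.js API:

Excel.run(function (context) {
    // get the currently selected range and queue a load of its values
    var range = context.workbook.getSelectedRange();
    range.load("values");
    // nothing is actually fetched until context.sync() resolves
    return context.sync().then(function () {
        console.log(range.values);
    });
}).catch(function (error) {
    console.error(error);
});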

Debugging

When it comes to iterating on your code and debugging, by far the easiest way is to use Office Online. This is because it is already in the web browser so debugging works the same way as you are used to - the add-in is just an iframe.

I was even able to do my dev & debugging right at home in Firefox on GNU/Linux!

At some point you will want to debug using an actual copy of Office running on Windows. I used Office 2016 on my wife's computer.

If you are not a Windows developer the following tip will save you a lot of time when it comes to debugging natively. It's called "F12 Developer Tools" and it's buried deep inside a Windows subdirectory in C:\Windows\System32\F12\.

What this tool does is attach a "web console" type of debugger to your Office add-in instance running inside Office. You can do stuff like console.log and also see JavaScript errors which are thrown.

Ready to roll


Once you've got those pieces in place you should have a basic add-in up and running. After that it's all about referencing the docs to figure out how to do the thing you want your add-in to do.

Submit

Finally when you're ready to ship, you submit your manifest.xml to the App Source "seller dashboard". Expect to wait a few days for the validation team to get back to you, and to have to fix things. This process was useful as they see the software with a fresh pair of eyes and give actionable feedback.

LISP madness

So finally I should mention my LISP obsession. I actually wrote all the code for this plugin in a LISP called Wisp. It's a Clojure-like that compiles to JavaScript and seems quite similar to ClojureScript but with three core differences:

  1. It lacks almost all of the great features & tooling of ClojureScript.
  2. It is closer to being "JavaScript with LISP syntax".
  3. It compiles down very small if you know what you are doing. How small? My final compiled .js bundle for this add-in is just 8.2k non-gzipped.

So that's about it. I hope you got something out of this article on building Microsoft Office plugins using web tech.

Now back to open source dev for me. :)

Posted: Mon, 30 Sep 2019 12:25 GMT

How To Make Hy-lang More User-Friendly
https://mccormick.cx/news/entries/how-to-make-hy-lang-more-user-friendly

Hy(lang) is a LISP-family programming language built on top of Python. You get the rich Python language & library ecosystem, with a LISP syntax and many of the language conveniences of Clojure, such as reader macros for easy access to built-in data types. What's not to love?


I recently found out I have the most GitHub stars for projects written in Hy of any developer worldwide. With this admittedly ridiculous credential in hand I'd like to offer some opinions on the language.

I really like Hy a lot. I prefer writing code in Hy to writing code in Python, and I am writing this post because I want to see Hy do well. I originally wrote this as a list of things in Hy that could possibly be improved, from the perspective of an end user such as myself. Then my friend Crispin sent me this fantastic video by Adam Harvey: "What PHP learned from Python" and I realised that all of the issues I have with Hy stem from the same basic problem that Adam talks about.

Here is the crux of the problem: I've written a bunch of software in Hy over the past few years and often when I moved to a newer point-release of the language all my old code broke. My hard drive is littered with projects which only run under a specific sub-version of Hy.

gilch[m]: It might depend on the Hy version you're using.

I understand that the maintainers of the language, who are hard-working people doing a public service for which I'm deeply grateful, are concerned with making the language pure and clean and good. I understand that languages have to change to get better and they need to "move fast and break things" sometimes. That's all fine and good.

However, I think Adam Harvey's point stands. If you want users to use and love your language:

  • Break things cautiously
  • Maintain terrible things if it makes life better
  • Expand the zone of overlap [backwards compatibility between consecutive versions]

I think if you can maintain backwards compatibility you should.

Almost all of the breakages I've experienced between Hy versions could have been avoided with aliasing and documentation. Not doing this backwards compatibility work basically tells your users "don't build things with this language, we don't care about you." I know that is almost certainly not the attitude of the maintainers (who are lovely, helpful people in my experience) but it is the way it comes across as an end user.

Here is a list of things which have changed between versions which blew up my Hy codebases each time I upgraded Hy:

  • Renaming true, false, and nil to the Python natives True, False, and None. These could have been aliased to maintain backwards compatibility.

  • Removing list-comp and dict-comp in favour of the excellent lfor, dfor, and friends. Again, small macros could have aliased the old versions to the new versions.

  • Removing def in favour of setv. A very small macro or maybe even an alias could have retained def as it is pretty much functionally identical from the perspective of an end user.

  • Removing apply, presumably in favour of #**. Support could have been retained for the functional apply. There are situations where a proper apply is favourable.

  • Removing the core macro let. It seems there was an issue where let would not behave correctly with generators and Python's yield statement. A more user-friendly solution than chopping the imperfect let from the default namespace and breaking everybody's code would have been to document the issue clearly for users and leave the imperfect let in. I know it is available in contrib. Moving it to contrib broke codebases.

Having your old code break each time you use a new version is frustrating. It makes it hard to justify using the language for new projects because the maintenance burden will be hy-er (sorry, heh).

Some other minor nits I should mention which I think would vastly improve the language:

  • loop/recur should be core. Like let these are available in contrib, but that means you have to explicitly import them. Additionally it would be super nice if they were updated to support the vector-of-pairs declaration style of let and cond etc.

  • Why does assoc return None? This is completely unexpected. If there are performance issues then create an alternative which does what Python dict assignment does (aset?) but it seems unwise to break user expectations so fundamentally.

36583050842532df644f183cc62890e7.png

It is much easier to provide criticism than it is to write working code. I am sorry this is a blog post instead of a pull request, and I hope this criticism is seen as constructive. I want to thank everybody who has worked on Hy. I am a huge fan of the language. It is an amazing piece of software and I am very grateful for your work.

Posted: Thu, 26 Sep 2019 13:20 GMT

Speaking schedule 2019 and beyond
https://mccormick.cx/news/entries/speaking-schedule-2019-and-beyond

I've got three conference talks coming up in Perth (Australia), London, and the Gold Coast (Australia). If you're nearby let me know - I would love to buy you a coffee/beer and hear what you're up to.

Security BSides

This Sunday, September 22nd, Perth, Western Australia.

I'm presenting "Bugout: practical decentralization on the modern web." It's a talk about the library I built on top of WebTorrent for building web based decentralized systems.

Bsides Perth Logo

Clojure eXchange

December 2nd-3rd, London, UK.

I'm giving a keynote: a show and tell of the multitude of strange things I've been building with the Clojure[Script] family of programming languages, and how Clojure enables the bad habit of starting way too many projects. I'll also give an update on Thumbelina, the tiny MIDI controller I've been working on with my friend Dimity.

Skills Matter Clojure eXchange

Linux.conf.au

January 13th-17th, Gold Coast, Australia.

I'm talking about Piku, and how it helps you do git push deployments to your own servers. I've made a bunch of contributions to this open source project in recent months. I've personally found a huge productivity gain from being able to deploy internet services without having to think too much, and I'm excited to show others this too.

Linux Australia Logo

Posted: Wed, 18 Sep 2019 07:50 GMT

Notes on "History of the Blockchain" by Nick Szabo
https://mccormick.cx/news/entries/notes-on-history-of-the-blockchain-by-nick-szabo

In November 2015 Nick Szabo gave a talk on the history of the blockchain which was dense with useful ideas.


Here are some notes I took on his talk:

  • Philosophical inspiration to Cypherpunks who invented Cryptocurrency:

    • Ayn Rand: Galt's Gulch - independence from corrupt institutions.
    • Tim May: "protect yourself with cryptography" (cyber Galt's Gulch).
    • Friedrich Hayek: Institutions of property, contracts, money are actually important to human freedom.
  • Use computer science to minimize vulnerability to strangers.

  • Non-violently enforce the good services of institutions.

  • "Try to secure as much as possible" not just communication.

  • Cryptography: only secures communications from 3rd parties.

  • David Chaum: let's apply this to money too.

  • Centralization problem remained in digital cash startups.

  • Bad assumptions in computer security: trusted third parties like certificate authorities are secure.

  • Trusted third parties are security holes.

  • Centralization is insecure.

  • E.g. Communists were able to get a stranglehold with just control of railroads, newspapers, and radio.

  • Gold is insecure

    • Spanish looted Aztec gold, pirates looted Spanish gold.
    • Part of end of gold standard was German U-boat threat to British gold transportation.
    • Franklin Roosevelt's government confiscated gold.
    • In modern times X-ray machines detect gold easily.
  • Decentralization per computer science is much more automated & secure than traditional security.

  • CS decentralization can only replace small fraction of traditional security but with very high cost savings.

  • Traditional security isn't the protocol itself, requires strong external law enforcement.

  • Computer security can be secure across national borders instead of siloed inside jurisdictions.

  • Cryptocurrency helps solve this through decentralization.

  • Separation of duties: several independent people to perform a task to get it done.

  • Each node as independent as possible.

  • E.g. crude measure of independence: geographic diversity of nodes.

  • Number of nodes is only a proxy measure of decentralization.

  • Smart contract:

    • Long lived process or "distributed app".
    • Acts like a contract.
    • Performance, verification etc.
    • Generally 2 parties + blockchain (replacing the trusted third party).
  • Wet code = traditional law. Dry code = smart contract.

  • Law is subjective, enforced with coercion, flexible, highly evolved.

  • Smart contracts are mathematically rigorous, cryptographically enforced, rigid, very new.

  • Law is jurisdictionally siloed, and expensive to execute.

  • Smart contracts are super-national & independent and low cost.

  • Seals in clay/wax were important when writing was invented: signature + tamper evident.

  • Modern seals at e.g. crime scenes: sealing door, evidence bag with numeric identifier.

  • Blockchain can keep secure log with both semantics (serial number) and proof of evidence (photo hash).

  • Put proof of evidence on blockchain as well as semantic reference for contract code to interface with.

  • Can secure physical spaces with same mechanism.

  • Proplets: blockchain can tell them which keys have which capabilities.

    • For almost any valuable property that can be controlled digitally
    • Example: Auto-repo collateral upon contract breach.
    • Example: creditors without access to offshore oil rig used as collateral.
  • Recent project:

    • Trust minimized token: secure property titles, colored coins. Securing transfer of ownership.
    • Trust minimized cash flows (dividends, coupons, etc).
  • Idea: social networks for blockchains. Execute payment swaps & smart contracts after linking social accounts together.

  • Let's try to think about security more broadly instead of only encryption.

  • Let's try to protect everything that's important to us, without centralization.

Posted: Wed, 28 Aug 2019 14:09 GMT

Joplin With Self-hosted Sync
https://mccormick.cx/news/entries/joplin-with-self-hosted-sync

For some time I have been looking for a writing solution with the following properties:

  • Lets me review and make minor edits on my phone.
  • Is synched to my laptop where I can write longer form.
  • Supports simple markup such as Markdown.
  • Supports attachments and images.
  • Is Free & Open Source Software and can be self-hosted.

The solution is Joplin.

[Screenshot: Joplin]

It's a wonderful piece of software. There are apps for all of the usual platforms, including a direct link to the Android apk, which is a blessing if you are somebody who opts out of using Google services.


Some other things which are great about Joplin:

  • You can edit notes in an external editor.
  • You can paste images directly into your document.
  • It imports and exports many formats including Markdown.
  • It can export individual articles to PDF.
  • Its native export format "JEX" is a simple tar file.
  • Its native data store is on-disk.
  • Sync is optional and very easy to set up.
  • Sync uses the widely supported webdav protocol.

Joplin Sync

I got sync between my devices working quickly by using Piku to deploy a simple webdav server to my Piku VPS. If you want to do this yourself, check out the webdav server repository and then push it to Piku as follows:

git remote add piku piku@MYSERVER.NET:webdav
git push piku master
piku config:set NGINX_SERVER_NAME=WEBDAV.SOMEDOMAIN.NET PASSWORDS="username:password username2:password2 ..." FOLDERS="/joplin:/home/piku/joplin"

After that is up and running you can configure Joplin sync by selecting "webdav" on each device and then entering the URL WEBDAV.SOMEDOMAIN.NET/joplin/ and the username:password pair you specified above.

I wrote this post with Joplin.

Posted: Thu, 01 Aug 2019 03:19 GMT

BEP44 For Decentralized Applications
https://mccormick.cx/news/entries/bep44-for-decentralized-applications

Depiction of decentralized network

In 2014 Arvid Norberg and Steven Siloti came up with a BitTorrent extension called BEP44. The basic purpose of BEP44 is to allow people to store small pieces of information in a part of the BitTorrent network called the DHT. The DHT ("distributed hash table") is a key/value lookup table that is highly decentralized. Prior to BEP44 it was used to look up the IP addresses of peers from the hash of the torrent they were sharing.

BEP44 introduces two new ways of storing key-value data in the DHT. The first is the ability to look up small bits of information keyed directly on their hash. The second allows for the storage of cryptographically authenticated data, keyed on the public key. What this means is if you have some public key K then you can look up authenticated blobs of data stored in the DHT by the owner of that key. This opens up a variety of useful abilities to people building decentralized applications.

For instance when you distribute a file on BitTorrent it is immutable - you can't change it - but BEP44 provides a way to tell people "hey, there is a new version of this file that I shared before which you can find over here" in a secure and authenticated way. It turns out that this basic mechanism can be used to build a wide variety of decentralized, authenticated functionality and people have built cool experiments like a decentralized microblog using it.
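Here's a rough sketch of what publishing such a mutable pointer looks like, assuming the mutable-put interface of the bittorrent-dht and ed25519-supercop npm libraries (check their docs for the exact details):

var DHT = require('bittorrent-dht')
var ed = require('ed25519-supercop')

var keypair = ed.createKeyPair(ed.createSeed())
var dht = new DHT({ verify: ed.verify })

dht.put({
  k: keypair.publicKey,  // the public key the record is keyed on
  seq: 1,                // bump this every time you publish a new version
  v: Buffer.from('pointer to the latest version of my data'),
  sign: function (buf) {
    // prove ownership of the public key by signing the record
    return ed.sign(buf, keypair.publicKey, keypair.secretKey)
  }
}, function (err, hash) {
  // others can now look this record up in the DHT using the public key
})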

The purpose of this post is to show you how the datastructure works and how to apply it more widely in your own software.

If you just want a working implementation, check out the decentral-utils JavaScript library, which has a single-function implementation of the datastructure and algorithm that you can use in the browser. Web browsers unfortunately do not have direct access to the BitTorrent DHT, but the library is still useful for authenticating up-to-date data blobs that are passed around between browsers, for example over WebRTC.

What's good about BEP44

Most of the advantage of BEP44 comes from the cryptographic signing. What this offers is a way for somebody (let's call her Alice) to verify that a piece of data from somebody else (let's call him Bob) is authentic and that it has not been tampered with. You can do cryptographic signing without BEP44 of course.

However, the BEP44 specification brings some other features for building decentralized systems:

  • Namespacing against a public key.
  • Replay attack prevention.
  • A compare-and-swap primitive.

The namespacing feature means that a single public key can have a bunch of different data blobs that others can authenticate. Because the datastructure is keyed on both the public key and a "salt" field, a single public key can store multiple blobs with different "salt" values as a sub-key.

Replay attack prevention is accomplished with the sequence field, which in BEP44 is a monotonically increasing integer. The way a replay attack works is that some adversary keeps an old copy of something you have signed, and when you are offline they send it again and pretend it's a new message. Because the message is signed with your key people are fooled by the adversary into thinking the old data is current. In BEP44 the sequence field prevents this because the adversary will have a data blob with a lower sequence number than the last one you shared, and they can't generate a new blob with a higher sequence number because they do not have your private key. So the replayed data blob will be rejected.

Finally, the compare-and-swap primitive works by specifying a previous sequence number that the current data blob should replace. Peers will reject data blobs which don't replace the current sequence number they hold. In this way it's possible to achieve basic synchronisation and atomicity of data blobs. Using compare-and-swap guarantees that the new value you want to insert into the DHT can be calculated based on up-to-date information.

Example usage

Imagine you take upon yourself the small task of building a decentralized social network. Users have a timeline of posts which they have written. When they write a new post it should be appended to their timeline and people should see the updated timeline with the new post.

In the centralized social network case this is easy. The social network provider TwitFace has an internal copy of the poster's timeline to which they add the new post. Followers are notified of the update and the updated timeline is sent to them by the provider. The way the authentication works in this case is the original poster has logged in with their password and so the provider knows who they are, but the receivers of the update must trust the provider.

Depiction of centralized post authentication

How do readers know that the post is authentic and comes from the original poster? The reader must trust the provider completely. They must trust that the provider is showing them the authentic timeline of messages, that the messages have not been modified, that fake messages have not been injected into the timeline, that new messages have been added to the feed in a timely fashion, and that no message has been censored and removed from the feed by the provider.

The problem with this model is that trusted third parties are security holes. Centralized social networks do modify things people have said. They do inject posts into people's timelines (ads and worse). They do change the order and timing of posts to suit their own goals. Perhaps worst of all, they do censor posts completely.

I won't get into the politics of censorship. Suffice it to say that censorship is fine and good right up until the point where your values differ from the entity doing the censoring. Values change, mysterious outside influences are myriad, and few people have complete alignment of values even with our noble corporate overlords in Silicon Valley.

The basic goal here is that if you've chosen to read somebody's timeline you can trust that the feed of posts you are reading from them is authentic and complete and as they intended.

Happily, we can use the BEP44 technique to route around the trusted-third-party centralized-social-network-provider security hole. BEP44 provides a way to receive an update and verify it cryptographically. It provides a way to know that this is the latest and complete version. It allows the verifier to do this using only the received data structure, without requiring any kind of secret server-side logic or password-based authentication like you would find in a centralized social network.

Depiction of decentralized post authentication

How it works

At the heart of the algorithm is a small datastructure which can be shared, updated by the sharer, and then shared again. Receivers can verify the updates are authentic. The idea is that you can share a small piece of authenticated data which points to a larger piece of immutable data. For instance, you can share an authenticated datastructure with the hash of a torrent containing all of your posts (an RSS feed for instance). Receivers then know that torrent is the current representation of your timeline of posts.

The datastructure has the following fields:

  • value
  • seq
  • cas [optional]
  • salt [optional]
  • pubkey
  • signature

When a verifier receives a copy of this data structure they perform the following checks:

  • Is the size of the value field below the maximum size?
  • Does the value conform to the expected format / encoding?
  • Is the seq number higher than the last seq number I saw (if any)?
  • If cas is present does it match the previous seq number I have?
  • Is the attached signature made with the private key corresponding to the attached pubkey over the datastructure's value, seq, and salt fields?

In this final point they are checking that the structure has been digitally signed with the author's private key. This is accomplished by concatenating salt + seq + value and then checking that the signature is valid for that data and the given public key.

In this way verifiers can know that an update from a known keypair/identity is authentic and current.
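Here's a minimal sketch of that verification in JavaScript, using the tweetnacl library and the simplified salt + seq + value signing encoding described above (real BEP44 bencodes the signed fields before signing):

var nacl = require('tweetnacl')

function verifyUpdate (record, lastSeen) {
  // record = {value, seq, cas, salt, pubkey, signature}
  if (record.value.length > 1000) return false        // size check (BEP44 allows up to 1000 bytes)
  if (lastSeen && record.seq <= lastSeen.seq) return false  // replay check
  if (record.cas !== undefined && lastSeen && record.cas !== lastSeen.seq)
    return false                                      // compare-and-swap check
  // signature check over salt + seq + value
  var signed = Buffer.concat([
    Buffer.from(record.salt || ''),
    Buffer.from(String(record.seq)),
    Buffer.from(record.value)
  ])
  return nacl.sign.detached.verify(signed, record.signature, record.pubkey)
}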

Read more posts on the subject of cryptography & decentralized systems.

Posted: Sun, 12 May 2019 03:28 GMT

Build a decentralized web chat in 15 minutes
https://mccormick.cx/news/entries/build-decentralized-web-chat-15-mins

This post originally appeared on the David Walsh blog.

In this 15 minute tutorial we're going to build a simple decentralized chat application which runs entirely in a web browser.

All you will need is a text editor, a web browser, and a basic knowledge of how to save HTML files and open them in the browser.

Diagram of how WebRTC works browser to browser

We're going to use Bugout, a JavaScript library that takes care of the peer-to-peer networking and cryptography.

Get the full tutorial on GitHub

Read more posts on the subject of cryptography & decentralized systems.

Posted: Wed, 27 Mar 2019 13:13 GMT

Inkscape Animation with SVG Animation Assistant
https://mccormick.cx/news/entries/inkscape-animation-with-svg-animation-assistant

Update! SVG Flipbook is an app for doing flipbook-style layer animation with Inkscape.

SVG Animation Assistant is an open source companion application for Inkscape.

SVG Animation Assistant interface showing Inkscape and a walk cycle animation

It runs alongside Inkscape and helps you animate by cycling through the layers of your SVG as you edit it.

This allows you to do basic flip-book style animation. Each layer in your SVG is one frame of the animation.

Customise frame timing and behaviour by editing the layer name:

Inkscape layers UI with customisation

  • Set the number of milliseconds to pause on each frame by entering a number in brackets in the layer name like (100) for a pause of 1/10th of a second.
  • Add static background frames by putting (static) in the layer name.
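For example, a hypothetical walk cycle might use layer names like these (the bracketed numbers are per-frame pauses in milliseconds):

background (static)
walk 1 (100)
walk 2 (100)
walk 3 (250)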

The animation live-reloads in the assistant window whenever you hit save in Inkscape.

SVG Animation Assistant interface showing live reloading

Run on Linux

If you are a Linux user you can use the online version of this app.

You can get the full source code to this application on GitHub.

Posted: Tue, 20 Nov 2018 11:53 GMT

Zero Feature Software
https://mccormick.cx/news/entries/zero-feature-software

Let's build software like an axe.

An axe in a block of wood

feature

n.

A prominent or distinctive aspect, quality, or characteristic: a feature of one's personality; a feature of the landscape.

Properties of an axe:

  • Does not have "features".
  • Useful to ordinary people.
  • Does only one thing & does it well.
  • Simple to obtain, use, and maintain.
  • Does not have "upgrades".
  • Works on a platform that nearly everybody has.
  • Is the property of the owner.

For the task of chopping wood the axe is a perfect technology.

An axe is made to do one thing and do it well. It has no features.

An axe is finished. You buy the axe from the store and it chops wood.

The axe design changes only over millennia: slowly and carefully. The axe itself does not change.

An axe does not receive upgrades. Nobody wants "continuous delivery" for their axe. The axe is immutable.

Software axes

Let's build software like an axe.

  • Let's remove all the "features".
  • Let's do one thing and do it well.
  • Let's build things simple enough to secure.
  • Let's finish the software before we release it.
  • Let's release immutable, self-contained artifacts.

By all means, experiment. When it comes time to release let's be disciplined and craft something secure, wholesome, and complete.

Examples of software axes

Hand holding an axe

Posted: Mon, 12 Nov 2018 02:36 GMT

Hashcash Auctions for Decentralized Resource Allocation
https://mccormick.cx/news/entries/hashcash-auctions-for-decentralized-resource-allocation

Abstract

Hashcash is a mechanism for defending against spam and denial-of-service attacks in email and other decentralized systems. Implementors of systems using hashcash face the issue of how to set the proof-of-work difficulty. A general solution to this problem is given which can be used to predictably allocate resources in decentralized systems where individual nodes each contribute finite resources. Some emergent effects of this solution are explored.

Hashcash client-puzzle token represented as a dollar bill

Hashcash

Nodes in decentralized systems face two closely related issues around the use of their resources. The first problem is how to fairly allocate resources between connecting clients, and the second is how to protect themselves against resource depletion attacks. Some systems, like Bittorrent, largely sidestep these issues because the goals of participants (download the data quickly) are aligned. In other systems such as email, nodes are vulnerable to resource depletion attacks like spam and denial-of-service (DoS) attacks.

In 1997 Adam Back proposed an ingenious & practical scheme for countering DoS attacks and spam emails called hashcash. This scheme was famously referenced by, and used with some tweaks in, the software which launched the cryptocurrency revolution: Bitcoin.

As noted in the 2002 Hashcash paper the idea of using a proof-of-work in this way predates Hashcash in the work of Cynthia Dwork & Moni Naor, 1993 and later in the client-puzzles work of Ari Juels & John Brainard, 1999.

Adam Back's implementation is notable because it was based on a modern widely implemented cryptographic hash function, used a cleverly simple leading-zeroes hash collision mechanism which is verifiable with the naked eye, and most importantly of all it came with concise source code which anybody could compile and run ("cypherpunks write code").

The common idea behind these proof-of-work schemes is that you do not allocate resources to the processing or storage of incoming messages unless the sender has proven that they have done a certain amount of computational work. A spammer or DoS attacker is forced to multiply the work required by the number of attacking nodes they are utilising making it more expensive for them to attack. The key to the proof-of-work mechanism is that it is easier for the receiver to verify the proof than it is for the sender to construct the proof, and secondly, that the receiver can specify how much work the sender needs to perform in order to be considered. In Hashcash the construction of proofs-of-work and their more efficient verification is accomplished with a basic building block of cryptographic systems, a computational trapdoor: the hash function. The second key to proof-of-work, that of the receiver specifying the difficulty to require of clients, is not clearly addressed in this body of work. Here we'll describe a general solution to the problem of setting PoW difficulty and explore some emergent effects of the solution on decentralized systems.
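To make the trapdoor concrete, here is a minimal sketch of the verification side in JavaScript (using SHA-256 via Node's built-in crypto module for simplicity, where the original hashcash used SHA-1): counting leading zero bits is the whole trick.

var crypto = require('crypto')

// count the number of leading zero bits in a hash buffer
function leadingZeroBits (hash) {
  var bits = 0
  for (var i = 0; i < hash.length; i++) {
    if (hash[i] === 0) { bits += 8; continue }
    bits += Math.clz32(hash[i]) - 24  // leading zeros within this byte
    break
  }
  return bits
}

// verifying costs one hash; constructing a valid token takes ~2^difficulty tries
function checkStamp (stamp, difficulty) {
  var hash = crypto.createHash('sha256').update(stamp).digest()
  return leadingZeroBits(hash) >= difficulty
}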

Drawing of a pick and shovel

The difficulty of calibrating hashcash difficulty

In Dwork & Naor (1993) they say:

There is a pricing authority who controls the selection of the pricing function and the setting of usage fees.

And later, in the section on the Fiat-Shamir signature scheme:

A reasonable choice is k = 10.

Where k here is equivalent to what we now call the difficulty in cryptocurrency proof-of-work contexts. Basically "here's a reasonable value to use on today's computers". This strategy is not tenable in many decentralized systems because a pricing authority, or "trusted third party" is an inherent security risk (Szabo, 2001).

In Adam Back's 2002 paper there is a section called "Dynamic Throttling":

With interactive hashcash it becomes possible to dynamically adjust the work factor required for the client based on server CPU load.

But it's not clear exactly how much difficulty should be assigned in proportion to the server CPU load.

Further, on the "Bitcoin" page of hashcash.org:

Hashcash difficulty is static and eroded by Moore's law currently 20 bits.

Again we see a "reasonable value to use on today's computers".

In "How bitcoin uses hashcash":

Hashcash was expected to adjust difficulty every 6 months or year, manually based on observed moore's law estimates. It was proposed that this be measured against a benchmark machine by some vague/undefined human consensus process and broadcast via signed difficulty update headers, and/or long TTL DNS TXT records.

He further points out that Bitcoin's "automatic inflation control" innovation has solved this problem for the cryptocurrency case.

In the 2002 paper again we see the following in the section on mitigation of TCP connection-slot depletion, which concludes with:

Connections will be handed out to users collectively in rough proportion to their CPU resources, and so fairness is CPU resource based (presuming each user is trying to open as many connections as he can) so the result will be biased in favor of clients with fast processors as they can compute more interactive-hashcash challenge-response tokens per second.

The implementation detail of exactly how much difficulty to require per connection slot is again left as an exercise to the reader. Adam Back's original hashcash announcement on the Cypherpunks mailing list makes reference to a fixed difficulty of the first 20 bits of zeroes, and as noted in the Wikipedia page on hashcash this became a sort of de-facto standard reference value.

Drawing of a flyer with tear-away tabs

Existing solutions

How high to set proof-of-work difficulty is of course not completely vague and unsolved. Bitcoin famously pits miners against each other using a hashcash difficulty that is adjusted every two weeks to a value computed from previously achieved hash difficulties. Referring back to the TCP connection slot depletion example in the 2002 paper, we can think of the Bitcoin miner hashcash use-case to be a slot of size 1 that is only available to be filled every 10 minutes. Hash rates are forced upwards by competition for that single slot between miners who will be rewarded under the incentive model of the Bitcoin system.

There is another mechanism in Bitcoin that we can also borrow from for decentralized systems (absent a fully functional and carefully engineered global store-of-value system called a Blockchain); the fixed block size. In Bitcoin the size of each block storing transaction data, and therefore ultimately the number of transactions, is fixed. This was a point of contention for some time amongst Bitcoin enthusiasts and stakeholders but in the end those supporting a fixed block size had their day and Bitcoin today retains a fixed block size into which transactions from the mempool must be squeezed.

Every Bitcoin transaction is accompanied by a fee paid to the miner who mines the block in which it is included. To decide which transactions should be accepted into the block they are minting, miners naturally refer to these fees out of financial self-interest, to ensure they capture the highest aggregate fee possible. When a new block is minted a miner will sort the transactions from those with the highest fee paid to those with the lowest fee paid. Then they will insert into the block the top-N-by-fee transactions that will fit into the fixed block size.

So we have three models to build our solution on: the TCP connection slot example, Bitcoin's automatic inflation control, and Bitcoin's fixed block size.

Consider that in all three of these examples the fundamental resource being allocated is subject to fixed constraint, whether it is TCP slots, Bitcoin issued per 10 minutes, or the transaction-processing capacity of Bitcoin nodes. It is in fact always the case that there are resource limits placed on networked services even if that limit is sometimes so high as to give the illusion of being unlimited. Take for example the original use-case of hashcash, the email inbox: here the limited resource is your own time and attention. There is some practical upper bound on that resource such that you would be fundamentally unable to check your email if you were receiving e.g. 10,000 emails per hour. This is also true of the resources of nodes in any decentralized system.

We can use the fixed block size model in combination with hashcash to balance load and distribute resources efficiently in decentralised systems with automatic difficulty adjustment and without requiring a full cryptocurrency blockchain implementation or trusted central authority.

Drawing of an auctioneer's gavel

Fixed resource hashcash calibration

The "fixed block size" mechanism can be used more generally in decentralized systems to protect nodes from resource depletion and deterministically allocate resources by conducting a "hashcash auction" between clients over the available resources.

The hashcash auction works as follows:

  • Each node specifies the number of clients they can comfortably serve as R, given their resource limits on disk, bandwidth, CPU, etc. This can often be computed based on simple measurement of the resource(s) in question.
  • Nodes issue HMAC'ed tokens ("symmetric key certificates" in the hashcash paper) to clients who request access; the tokens include a timestamp so that they can be expired.
  • To use a node's resources, clients must bid for a slot by performing a PoW over these tokens.
  • Nodes publish the upper, lower, and median PoW currently achieved by clients occupying the available resource slots R.
  • Connecting clients are added to the client pool, which is sorted by PoW difficulty-achieved, and only the top R by difficulty are serviced.
  • (Optionally, a minimum PoW can be enforced, to emulate the same basic DoS protection that hashcash provides.)
  • (Optionally, dependent on use-case, clients are disconnected after some time and required to re-bid for a slot).

In short, you decide how many clients you can service at a time (R), then sort connecting clients by their PoW height, and then deny access to, or disconnect clients who fall outside the top R.

Node awaiting client connections

Note that the server is not required to store any information relating to the tokens it gives out to prospective clients because it can use the HMAC to verify the tokens it has issued and the time at which they were issued. In this way the issuance of tokens and their expiry becomes very cheap for a service node.
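A sketch of that stateless token flow, with a hypothetical field layout, using Node's built-in crypto module:

var crypto = require('crypto')
var SECRET = crypto.randomBytes(32)  // known only to this service node

// issue a token embedding the time it was issued
function issueToken (clientId) {
  var payload = clientId + '|' + Date.now()
  var mac = crypto.createHmac('sha256', SECRET).update(payload).digest('hex')
  return payload + '|' + mac
}

// verify without server-side storage: recompute the HMAC
// and check the embedded timestamp for expiry
function verifyToken (token, maxAgeMs) {
  var parts = token.split('|')
  var payload = parts[0] + '|' + parts[1]
  var mac = crypto.createHmac('sha256', SECRET).update(payload).digest('hex')
  if (mac !== parts[2]) return false                 // not issued by us
  return Date.now() - Number(parts[1]) <= maxAgeMs   // not expired
}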

Client bids for a service slot

The natural consequence of a system like this is that clients must compete for available slots to use the node's resources, and a node does not need to specify exactly how much PoW each client must perform. Instead clients arrive at this value through an emergent bottom-up bidding process. The node no longer has to guess at "20 bits" and can instead honestly say "look, here is how hard other clients are working, if you want to access my resources you'll have to do better".

Client 4 out-bids

The clients themselves determine the proof-of-work difficulty setting in an ongoing auction where they must make a higher bid than other clients in order to be allowed access to consume the fixed set of resources, or go elsewhere.

To cope with large numbers of clients and provide a higher level of granularity in the PoW ranking, a simple less-than operator can be used so that clients can bid for intermediate values. For example 0x01fe < 0x01ff so a client bidding with a hash of 0x01fe would beat a client bidding with a hash of 0x01ff.
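Putting it together, the core ranking on a service node might look something like this sketch (Buffer.compare provides the less-than ordering; numerically lower hashes represent more work under the leading-zeroes scheme):

// keep the top R bidders; clients outside the top R are denied or disconnected
function serviceableClients (clients, R) {
  return clients
    .slice()  // don't mutate the original list
    .sort(function (a, b) { return Buffer.compare(a.bidHash, b.bidHash) })
    .slice(0, R)
}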

Properties of this system

What this restricted-resources scheme gives us is a way to measure the sacrifice (Nick Szabo, 2002, 2005) that a client is willing to make for the privilege of using a node's resources. As with the human measure of time sacrificed in Szabo's article, it is not an ideal way to measure the importance of incoming client connections, but it is better than existing systems, which are essentially a lottery, or in some cases "let the bad guy win". It is not perfect but rather a "proxy measure" which gives a service a deterministic ranking system of roughly those clients who most badly want or need access.

Drawing of an hourglass

At first this system seems unfair - those clients with the most CPU are able to dominate access to resources - but notice that it is much fairer in the worst case where a motivated legitimate client can still get access under DoS or spam attack by working hard to out-compete just a single instance of the attacker. Given enough motivated legitimate users there now also exists a way for them to collectively out-compete a spam or DoS attack. In other words, utilising this scheme has a better failure mode than the free-for-all which normally prevails in networked systems.

Another way of looking at this system is that we have inverted the security aims from "keep out the bad guys" to "keep in the good guys". By giving motivated users a way to express the strength of their need we are able to keep those clients connected who signal the most need for access to the resources. We are able to preference those users who will actually utilise the resource with a higher probability than those users who will waste it, because wasting it is now costly.

A further benefit is that the service utilising this scheme is protected from hard failure. In the worst case failure mode an attacker is only able to deplete resources up to the comfortable hard-limit R specified by the server, meaning the server is never placed under such high load that it entirely crashes and burns, so long as R is specified within serviceable bounds.

In the real world there is always a constraint upon resources, and it is often the case that those wanting to consume a resource must hustle harder to get access. This scheme has an honesty that mirrors the reality of constrained resources rather than pretending that resources are infinite, which is what many historically optimistic decentralized systems do to their detriment. A computer cannot consume more disk space than 100% of disk, and it cannot utilise more CPU time than 100% of CPU time. Modern network services are often deployed without acknowledging these real constraints and with vague hand-waving about "scaling" by adding more machines to accommodate load. That's fine for Silicon Valley backed ventures with money to burn but decentralized systems are often run by volunteers contributing their precious personal resources. Silicon Valley assumptions on volunteer-run networks are likely to lead to ruin.

Finally, this scheme has additional emergent benefits in the case where clients have a choice of decentralized service providers. This case is common in decentralized systems where some subset of participants run their own "full nodes" which other clients may then utilize.

Bottom-up load balancing

A further interesting consequence of the hashcash auction scheme arises when we have many service nodes providing the same decentralized service. In the auction-determined proof-of-work difficulty setting, clients now have a clear signal about which nodes are under the heaviest use and can freely choose those nodes where the PoW, and hence the resource usage, is lower. Given a choice between node X which requires a very high PoW and node Y which requires a lower PoW, a client will naturally try to connect to node Y first.

Bottom-up load balancing

The result of clients seeking lower-difficulty service nodes is that they spread themselves more evenly across the available servers. This less clustered network graph can lead to a healthier, more decentralized system without requiring any kind of central coordination to achieve it. This emergent effect is a sort of bottom-up load balancing system.
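
A client-side sketch of this behaviour might look like the following (the node list shape is an assumption):

var nodes = [
  {address: "node-x.example", difficulty: 24},
  {address: "node-y.example", difficulty: 16}
];

// try the least-loaded node (lowest PoW requirement) first
var ordered = nodes.slice().sort(function(a, b) {
  return a.difficulty - b.difficulty;
});
// => node-y.example first, node-x.example as a fallback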

Expiry and time preference

Decentralized systems differ in the way in which resources are allocated over time. Sometimes the resource provided by nodes is small and fast, such as UDP packets in Bittorrent's DHT. Other times the resource may be longer lived and more expensive on a per-unit basis, for example disk space storage or a regularly run cron job.

In the situation where resources are long lived and expensive to maintain it makes sense to require clients to periodically re-bid for their access to the resources in question. For example, rather than storing a connecting client's data on disk indefinitely the service node could require that a client re-bids for the disk space once a month or once a year.

A variant on this would be to use a time-based weighting factor on the PoW submitted by connecting clients, such that the PoW of more recent connections is weighted higher during bid sorting than that of older connections. For example, if you wanted to preference more recent connections, instead of sorting on raw PoW you could sort on some variant of PoW / (t + 1), where t is the time since connection.

Graph of PoW over t plus 1

In the parlance of economics this introduces a time preference into the system.
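
A minimal sketch of such time-weighted bid sorting, assuming each connection records its demonstrated work pow and its age t in seconds:

function bidScore(pow, t) {
  return pow / (t + 1); // older bids decay towards zero
}

var connections = [
  {pow: 1000, age: 3600}, // strong but hour-old bid
  {pow: 400, age: 60}     // weaker but fresh bid
];

// sort so that the highest weighted bids come first
connections.sort(function(a, b) {
  return bidScore(b.pow, b.age) - bidScore(a.pow, a.age);
});
// => the fresh bid now outranks the old one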

Example

Let's close with an example.

Imagine a decentralized social network where users are able to make posts and collect and publish replies to those posts. This can be accomplished in a decentralized way by representing users as key pairs and having every post and every reply signed by its owner's key pair to demonstrate authenticity rather than relying on a centralized trusted third party for authentication. While a user is online they can host their own set of cryptographically signed posts for others to download via a peer-to-peer network, and collect replies to those posts in the same way. However, what happens when the user goes offline, turns off their devices, goes to sleep etc.?

One solution is to have "followers" of the user automatically mirror their posts for other users to download while they are offline. This solution is better than nothing but falls short when a user has few followers or only has followers in the same timezone such that they will all be offline around the same time. Segments of the social network would go dark for periods of time leading to a lack of reliability of access and a form of accidental time-based self-censorship. Some audiences may be satisfied with such a constraint as a tradeoff for a higher degree of privacy and overall censorship resistance, but a mainstream audience is unlikely to accept this lack of reliability.

A more robust solution would be to back this up with a set of volunteer-run "full nodes" which users could enlist to mirror data and collect replies in their absence. This architecture immediately raises a number of questions around allocation of the "full node" contributed resources of disk space and bandwidth. How do nodes protect themselves against greedy clients taking all bandwidth and disk space? How do clients protect against spammy replies? How do nodes protect against censorship via resource exhaustion and DoS?

Using the "hashcash auction" we can get a reasonable solution to these problems, bringing higher availability to the system. Those nodes which provide the service of replicating user posts would measure how many clients they are willing and able to serve based on system resources. They can each do this by measuring their disk space for example and allocating a fixed amount of disk per user to a fixed number of users R keeping this constraint within the disk space each has available. Each full node then requires connecting users to bid for a storage slot for the hosting of their posts while offline. Only those clients who are able to demonstrate their need the most strongly will be able to use the offline-replication service of a node. When new nodes are deployed clients will naturally tend towards them as they choose the lower PoW requirement relative to nodes which are serving many existing connections and therefore have higher PoW requirements.

Those users who might run a full node, and others interested in the health of the network, also have a clear signal as to the current load on the whole system and the relative need for more nodes by looking at the PoW required in aggregate. People who might run a volunteer node are also provided an assurance that there is a fixed cost on the resource they will be donating, making it more likely that people will be willing to run full nodes to support the system. Finally, there is an incentive to run a full node as the user could white-list the cryptographic keys of their own devices and those of their friends and family, meaning that those accounts are always given preference over the accounts of strangers, bringing benefit to both sets of users.

Drawing of Wampum beads

Addenda

  1. In the spirit of "cypherpunks code" I recently published scrypt-hashcash which provides a basis upon which to validate some of these ideas on the web. For example, browser-based clients could bid for resources by performing a hashcash PoW using this library and submitting it via HTTP, WebSocket, or WebRTC connection to the service conducting the resource auction.

  2. It is interesting that Hal Finney touched on a related idea in RPOW enabled BitTorrent and in true cypherpunk tradition offered code implementing his ideas:

Imagine a version of BitTorrent where nodes that have completed their downloading (called "seeders") could earn RPOWs by continuing to upload. And what could they do with those RPOWs? Imagine further that this version of BitTorrent allowed nodes to send RPOWs in order to get higher priority from other nodes which are downloading. Just such an experimental, RPOW-enhanced version of BitTorrent is now available for download.

Read more posts on the subject of cryptography & decentralized systems.

]]>
/tags/workblog Sat, 27 Oct 2018 07:11 GMT
Solderless Prototyping entries/solderless-prototyping https://mccormick.cx/news/entries/solderless-prototyping DSC_2597.JPG

]]>
/tags/workblog Wed, 03 Oct 2018 13:56 GMT
Decentralized Identity Linking entries/decentralized-identity-linking https://mccormick.cx/news/entries/decentralized-identity-linking Decentralized identity linking animation

A problem faced by decentralized systems is that of naming things. The problem is best expressed by Zooko's Triangle which conjectured that no single kind of naming system can provide names satisfying all three of the following properties:

  • Human-meaningful: Meaningful and memorable (low-entropy) names are provided to the users.
  • Secure: The amount of damage a malicious entity can inflict on the system should be as low as possible.
  • Decentralized: Names correctly resolve to their respective entities without the use of a central authority or service.

Something as simple as a user handle or nickname becomes a complex issue in a decentralized system. What you ultimately want is something like a decentralized Twitter handle. The handle is a single human-memorable string which points to one and only one user in the system. The first user to register a particular name gets to claim that name. Obviously the case of Twitter handles is not decentralized as it requires the Twitter database as a single point of authority on who owns which name.

As the Wikipedia article points out, and Zooko has acknowledged, the invention of the Blockchain allows for solutions under certain conditions. However, the effort required to design, build, and deploy a Blockchain alongside your distributed application is quite steep. If you want to use an existing Blockchain implementation then either your users must be invested in it or your app must invest in it and pay for name registrations.

The former situation requires your users to not just install your app but also to acquire some particular Blockchain currency, which steepens the barrier to entry for your app. The latter alternative requires your app to fork out for registrations, which means the naming system of your decentralized app will fail if you or whichever entity is paying for the names shuts down.

The Keybase Method

In a decentralized system "one user" often means "one or more cryptographic key pairs". The site Keybase offers a good alternative naming system in the situation where "user" means "keypair". The solution is to use existing online identities held at different providers and link them cryptographically to the user's key(s). This means that third parties can verify an identity as follows:

@zooko on Twitter has public key B123h9EAgVdnXWjxUa3N9Amg1nmMeG2u7.

Or to put it more rigorously:

A person who has control of the private key corresponding to public key X also has write access to online service at Y.

A lot of the time when we want a name for something, the latter is what we actually want. We want to know that the existing relationship and trust I have for @zooko on Twitter can be carried over into this new decentralized system I have started using. If I have multiple existing relationships on different services with @zooko on Twitter then I can also verify those and in the decentralized app I can give @zooko the simple name "zooko" (which can often be inferred from usernames on services) and refer to the person with that handle from that point forward.

The way that Keybase does this is to help the user to sign a statement with their signing key saying "I am @name on service X" and then post the text and resulting signature on the service itself and keep a record of where somebody can find the post. If you do this it means you can prove to people that your signing key is associated with some existing internet identity such as your Twitter account. People who want to remain anonymous or pseudonymous with respect to their existing internet identities can simply skip this linking step.

This technique can actually be used to link any internet location that the user has write-access to back to a public key that they have the private key for. You simply sign a statement saying "I have write access to location X." and post it at "location X". Some examples:

  • I have write access to https://mccormick.cx/ -> post signed message as textfile there.
  • I have write access to https://twitter.com/mccrmx -> post signed message as tweet there.
  • I have write access to https://github.com/chr15m -> post signed message as gist there.

Links to the signed post are kept and shared with other users who wish to verify the existing relationship.
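
As a concrete sketch, here is how such a statement could be signed and verified using TweetNaCl.js (the choice of library is mine - any signature scheme would do):

var nacl = require("tweetnacl");
var util = require("tweetnacl-util");

var keys = nacl.sign.keyPair();
var statement = util.decodeUTF8(
  "I have write access to https://twitter.com/mccrmx");
var signature = nacl.sign.detached(statement, keys.secretKey);

// post the statement plus this base64 signature at the location itself
console.log(util.encodeBase64(signature));

// anyone holding the public key can then verify the link
nacl.sign.detached.verify(statement, signature, keys.publicKey); // => true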

A hand drawn lock

The decentralized "Keybase method"

Keybase is a centralized website and system which means that it has convenient integrations and is easy to use, but it relies on their infrastructure and centralized, proprietary software. The good news is it's possible to use the exact same technique that they use in a decentralized system. Doing this gives a user the power to make statements like:

@zooko on Twitter is me (B123h9EAgVdnXWjxUa3N9Amg1nmMeG2u7) on this decentralized service.

And then other people can confidently start using the name "zooko:twitter" in the place of B123h9EAgVdnXWjxUa3N9Amg1nmMeG2u7 in the decentralized system. In fact if the person has signed proof-of-write-access on multiple services and they have the same username on those services we might as well just start calling them "zooko" instead of "zooko:twitter+github". If two zookos come along then we can just refer to them by their specific service-postfixed names, or some other qualifier, so that we can differentiate when interacting with them - see below for more on solving this "two Johns" problem.

What this means is that "zooko" in the decentralized system becomes a short-hand way of referring to an entity with whom we interact on a constellation of sites. In some ways this is more robust than naming systems on centralized systems (where anybody can "steal" a username by registering it before somebody else) because the alias actually represents the relationship rather than an entry in one specific proprietary database.

So how can we do the same thing in a decentralized system? Quite easily:

For each service the user is on:

  • They sign a statement saying they have write-access to their account at that proprietary service with the signing key that they use on the decentralized service.
  • They then post the signed statement to the proprietary service.
  • When users are connected for the first time each user shares their links to "proof of linked identities" (if any) so that they can verify any existing Internet identities.

No-namers & the contacts list model

What about in the case where somebody does not sign anything to say that they own some existing online aliases? What if your Mum starts using decentralized service X and she does not have a GitHub or Twitter account? What if somebody chooses not to link their existing identities and instead decides to start afresh?

In the real world we don't really get to name ourselves. We can suggest a name to people, but they get to call us whatever they want. This is how nicknames happen and it's a bit of an aberration that in the online world we get to choose one canonical name by which people call us on proprietary services.

What if naming in online systems worked more like the decentralized real world? What if we suggested a name by which somebody could call us, but it was up to them to use it or to choose a different name by which to call us? What if, in this new decentralized service that my Mum has started using, I simply refer to her as "mum"?

If somebody has no existing online identities they can simply suggest a name when two users first connect and trade keypairs. The receiving party can then choose to adopt the suggested handle for that user, or if there is some conflict in their existing set of collected aliases, add a qualifier, or choose an entirely different name that makes sense to them.

This is the "contacts list" model of naming. You can tell me your name but I don't have to use it for your entry in my contacts list. I can adopt the name you've suggested but if I know two "Johns" then I'm likely to add some other qualifier like "Tennis John" and "Football John".

Aside: it would be funny if even DNS worked like this. The first time you loaded Google it would suggest to you "please call me google.com" but you could choose whatever name you wanted to type into the URL bar from that point forward. Norms would form between humans using the internet for what different sites would be called, and sometimes they would be called something different than what the person owning the domain wanted.

I don't think I would use "google.com" for example. I can think of some better names for that site.

Summary

The system proposed here for naming users in a decentralized way uses an amalgam of two models: Keybase-style service signing, and the contacts list model.

Detailed view of decentralized identity linking contact card

To summarize the process of managing identities in this system:

  • Each user in the decentralized app is identified by one or more cryptographic key pairs.
  • A user may set a suggested handle they'd like other people to use.
  • Optionally a user is able to add existing identities (Twitter, GitHub).
  • Optionally the app helps the user create cryptographically signed back-links from their existing internet identities to the keypair.
  • The app stores the user's links to their signed posts if any.
  • When user A sees some user B for the first time the client receives the following information about user B:
  • Their public key or fingerprint.
  • Their suggested handle.
  • URLs of their cryptographically signed identity-linking posts.
  • The app then fetches user B's identity-link posts and verifies the signatures found there.
  • Optionally user A is able to modify or supply their own local name (handle) for user B.
  • The app creates a contact entry for user B with this information.
  • When displaying user B the application uses the handle (chosen by user A) stored in the contact card.
  • Under the hood the application uses user B's public key or cryptographic fingerprint/address to refer to that user.
  • In the user interface the contact card is visible appropriately: for example on hover of user B's handle.
  • The displayed card shows user B's handle, identity-links with verification status, and public key or fingerprint.
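
Putting the pieces together, the contact entry stored by user A might look something like this (field names are illustrative, not a spec):

var contact = {
  publicKey: "B123h9EAgVdnXWjxUa3N9Amg1nmMeG2u7",
  suggestedHandle: "zooko", // what user B asked to be called
  localHandle: "zooko",     // what user A actually calls them
  identityLinks: [
    {service: "twitter",
     url: "https://twitter.com/zooko/status/123456789",
     verified: true}
  ]
};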

The raw public key or fingerprint(s) can additionally be used to verify real world identity out-of-band such as at key signing parties, personal meeting, or over a known secure channel.

Read more posts on the subject of cryptography.

]]>
/tags/workblog Fri, 31 Aug 2018 01:46 GMT
On Self-hosting and Decentralized Software entries/on-self-hosting-and-decentralized-software https://mccormick.cx/news/entries/on-self-hosting-and-decentralized-software Sketch of a tree

Web applications often follow a client-server model meaning that there is a piece of software which runs in your web browser (the client) and a piece of software which runs on a server somewhere. I'm interested in this model, where it came from, and where it's going, and I discuss this below. I'll also discuss a new model for self-hosting web applications that I've been exploring.

tl;dr

  • To "self-host" a web application means to run the client and server on computers you own.
  • Self-hosting is a basic foundation of decentralization.
  • The "installable app" model on phones & PCs is quite decentralized.
  • Decentralization is robust and anti-fragile.
  • How can we bring self-hosting of web apps to everyone?

Brief history of software applications

In the old days most software people ran was in the form of applications that you would download onto your PC and run. This was quite a decentralised model in that you could find whatever software you liked and run it on your machine without asking permission from anybody.

The modern mobile-device app stores and the web have deviated from this model. In the case of app stores you can only download software that is sanctioned by the company who runs the app store. On Android it is quite easy to change the settings on your device to install software applications from any source but this is not the default, and on Apple devices it's very difficult to do.

The web has a more complicated approach. Running software in your web browser is generally as easy as visiting a URL and people are free to run whatever websites and "web apps" they like on their devices. As mentioned above however, the server-side portion of the web application you are using is often running on somebody else's server computer and not in your control.

Client-server model on the web

Over time more and more functionality has moved from server side code into the web browser. Applications like Gmail pioneered this transition. Recent browser technologies like WebRTC and service workers have given a stronger base to build client-only applications on. The "add to homescreen" paradigm, whilst relatively obscure, presents a vector for running web applications on your computers and devices in a similar way to installed applications.

The buzz-word "serverless" captures the zeitgeist: moving more and more of the software off the server and into the web browser client. The reason software developers are generally in favour of this move is because it is easier to deploy code to web browsers than it is to deploy code to servers; it's easier to get users to run that software (just send them a link); and there is less of a maintenance burden if you don't have to keep a server running.

There's another strong reason why client side code is more favourable than server side code and that is because most server code is running on servers you don't own. For example when you run Gmail the server code and all of your emails are completely controlled by a third party and trusted third parties are security holes.

Even "serverless" in the context of web apps just means that the server component requires less infrastructure and is easier for developers to deploy. The code is still running on a machine owned by some third party who is neither the user or the developer.

Whilst this move towards client-side web applications is good in terms of user control, a problem even with "serverless" web applications that run in your browser is that the software can be modified without your consent. When you load up a web application tomorrow the developer might have made a new version with features you don't like and your browser will load the new version automatically even if you wanted the old version. Even worse, because you load a fresh copy of the software each time you use it, a man-in-the-middle attacker might sneak a malicious bit of code into the version you load tomorrow that is not there today.

There are also many use-cases where only a piece of server-side software will do. Generally these use-cases fall under the umbrella of asynchronous communication, such as storing a file for later retrieval, sending a message to one or more people which might only be received when you are offline, syncing state between devices which might be connected at different times etc.

Examples of things the server code will typically take care of:

  • Connecting client apps together in real-time.
  • Data manipulation functions.
  • Proxying network requests.
  • Persistent data storage.
  • Running periodic jobs.
  • Access control.

So the world is trending towards a more fragile situation of centralized control of the software we all rely on. It would be better for humanity if we were instead moving towards decentralized models of software where an individual can run and control the software they want without fear of a fragile centralized system failing or being used against them by powerful interests and hackers.

Sketch of a tree

Self-hosting

So these latter problems are some of the reasons why self-hosting web applications is still a good move in terms of security and control. When self-hosting, all of the software and all of the data is running on your own computers and you don't have to trust any third party to not be evil, corrupt, make mistakes etc. You can choose which software to install and run and when without obstruction. You get the benefits of cloud connected software - shared state between devices and other users - without having to trust a third party.

The main problem with the self-hosting approach is that it is very technical and therefore simply not an option for most users. Some of the barriers to people self-hosting their web software stacks:

  • Renting or building a server.
  • Installation of server side code.
  • SSL certificate management.
  • Server maintenance.
  • Access control management.
  • Securing the server.

These are the types of things that systems administrators are paid a lot of money to take care of because they are difficult and skillful work. These are the things that big companies are doing for you when you use their online services. The unwritten contract we engage in today is that we give big companies control over our digital life and our valuable personal data in exchange for their building, installing, and maintaining the infrastructure code on their servers.

An additional problem is that even in the "host your own server" model there is a lot of centralization and passing around of valuable and private personal information. For example, obtaining a domain name and SSL certificate requires central authorities who almost always require a lot of personal information. Not to mention forking over cash. Additionally, if you host your code on a VPS - as is the case with almost all server side code on the internet - some trusted 3rd party has unlimited access to the machine you are renting.

Sketch of a tree

Some existing proposed solutions

So the interesting question is whether there is some way to make self-hosting possible for everybody in the same way that PCs and phones made "run the software I want" easy for everybody.

There was a time when a lot of web applications were deployed as simple PHP scripts with HTML, CSS & Javascript for the front-end. This was probably the simplest format for self-hosted applications as it was easy to find hosting providers who would let you run PHP, and installation was as simple as copying files up to a server. I think this is a big reason why PHP has remained popular for so long despite the fact that it's such an awful language with frequent security issues. It simply let you do what you want. These days though, a lot of modern software on the internet is not written in PHP and developers often prefer other language ecosystems.

Another idea for self-hosting was the Freedom Box pioneered by Free Software enthusiasts in the 2000s. The idea is pretty good but does not seem to have achieved widespread adoption. My guess is the use cases are still too obscure for most users. What does "run your own XMPP server" mean? Most people have no idea.

A lot of people today are of the mind that "blockchain" is the singular solution to the issue of decentralizing the software we run. In this model you pay in small increments for the server side code which runs on a distributed system shared amongst many participants. You maintain control of your piece of the distributed service and your data stored in it using cryptography. In this model the server component and app state is shared across many computers all at once and you basically pay for a small slice of it.

This is a fascinating idea but I am not completely convinced that "blockchain" will be the future of decentralized software applications because:

  • So far it is unproven in all except one application, that of decentralized digital gold.
  • There are a number of very successful decentralized systems that function well without any kind of token/monetary incentive (email, irc, http, bittorrent, git etc.) and that original model of decentralization by cooperating parties is still a sound, well proven one. People still use email.
  • The blockchain design encapsulates a trade-off of low efficiency for immense security and trust minimization. This means that blockchains are very good at storing and securing transaction data but they are not very efficient at storing lots of other types of data. Imagine having your own dropbox folders also filled with encrypted copies of those of everybody else in the world. It does not scale.

In addition to the above ideas there exist new ecosystems like ownCloud and Sandstorm which enable people to deploy their own "stacks" hosted on their own servers. These are noble efforts at decentralization and self-hosting into which many people have poured a lot of work. The problem with these "host-it-yourself cloud stacks" is that they often still require trust in some 3rd party hosting service, and quite deep technical knowledge. Sure, it is easy for a nerd to upload a PHP script, configure a MySQL server, purchase VPS hosting, and install a Let's Encrypt SSL certificate, but it is not always so easy for a nerd's mother, father or friend.

Sketch of a tree

A simple platform for self-hosting

I've been thinking about all of this in recent years. I've been asking myself these questions:

What if running persistent server side code was available to anybody, even non-technical users? What if installing server side software was as click-tap easy as it is on a phone or PC? What if you could run cloud software and servers on the desktop PC in your office, or tablet, or Raspberry Pi on your home network? What does it look like when anybody in the world can self-host their "cloud" software?

Criteria for such a system would be:

  • Easy to set up and administer.
  • Easy to install and uninstall new services.
  • Interfaces easily with web browser client software.
  • Secure.
  • Completely user-controlled and decentralized.

As I've thought about these ideas I've also been tinkering with WebTorrent. It's a cool piece of technology which is bringing a chunk of the BitTorrent architecture to the web.

Working with this technology made me realise something the other day: it's now possible to host back-end services, or "servers", inside browser tabs.

Client-server model with WebRTC

Instead of a VPS server running in some data center, you have a browser tab running on a spare computer. Instead of clients talking to servers over HTTP they can talk over WebRTC. You build your "backend" service code with the same tech as your client side web app, with HTML and Javascript.

When I realised this it blew my mind. I couldn't stop thinking about it. Maybe the persistent browser tab on the home PC can be the new server for self-hosting users? What if grandma could build out a server as easily as opening a browser tab and leaving it running?

So anyway, I've made this weird thing to enable developers to build "backend" services which run in browser tabs. It's called Bugout and it has been fun to hack together.
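
To give a flavour, the "server" half of a Bugout service looks roughly like this - I am quoting the shape of the API from memory of the README, so treat the exact calls as an assumption and check the repo for the real thing:

var Bugout = require("bugout");

// the "server" is just a long-lived browser tab (or Node process)
var b = new Bugout();
console.log("server address:", b.address()); // clients connect using this

// expose an RPC endpoint that clients reach over WebRTC instead of HTTP
b.register("ping", function(address, args, callback) {
  args.pong = true;
  callback(args);
});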

Check it out and let me know what you think.

]]>
/tags/workblog Sun, 12 Aug 2018 12:38 GMT
Hash Function Attacks Illustrated entries/hash-function-attacks-illustrated https://mccormick.cx/news/entries/hash-function-attacks-illustrated Here are some illustrated explanations of the main ways in which cryptographic hash functions can be attacked, and be resistant to those attacks.

Zooko Wilcox's blog post Lessons From The History Of Attacks On Secure Hash Functions gives us a nice overview of these and I've quoted his concise explanations below. Check out his great post for more detail and history on this topic.

A cryptographic hash function is an important building block in the cryptographic systems that keep us safe in our communications on the internet.

A hash function takes some input data and generates a hopefully unique string of bits for each different input. The same input always generates the same result.

Diagram of an ideal hash function

The input to a secure hash function is called the pre-image and the output is called the image.
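
For example, with Node's built-in crypto module (my choice of hash and runtime for illustration):

var crypto = require("crypto");

function sha256hex(preimage) {
  return crypto.createHash("sha256").update(preimage).digest("hex");
}

sha256hex("hello") === sha256hex("hello");  // true: same pre-image, same image
sha256hex("hello") === sha256hex("hello!"); // false: tiny change, new image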

I use the following key below:

A red square

Red for inputs which can be varied by an attacker.

A green square

Green for inputs which can't be varied under the attack model.

To attack a hash function the variable inputs are generally iterated on in a random or semi-random brute-force manner.

Collision attack

Diagram of a collision attack on a hash function

A hash function collision is two different inputs (pre-images) which result in the same output. A hash function is collision-resistant if an adversary can't find any collision.
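
To get a feel for the brute-force search, here is a toy collision hunt against a deliberately truncated 16-bit hash - a real 256-bit hash makes the same loop computationally hopeless:

var crypto = require("crypto");

// truncating to 4 hex digits leaves only 16 bits of image space
function tinyHash(s) {
  return crypto.createHash("sha256").update(s).digest("hex").slice(0, 4);
}

var seen = {};
for (var i = 0; ; i++) {
  var pre = "pre-image-" + i;
  var image = tinyHash(pre);
  if (seen[image]) {
    console.log("collision:", seen[image], "vs", pre);
    break;
  }
  seen[image] = pre;
}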

Pre-image attack

Diagram of a pre-image attack on a hash function

A hash function is pre-image resistant if, given an output (image), an adversary can't find any input (pre-image) which results in that output.

Second-pre-image attack

Diagram of a second-pre-image attack on a hash function

A hash function is second-pre-image resistant if, given one pre-image, an adversary can't find any other pre-image which results in the same image.

Hopefully these diagrams help to clarify how these attacks work!

Read more of my posts on the subject of cryptography.

]]>
/tags/workblog Mon, 30 Jul 2018 02:14 GMT
Talk: GNU/Linux in Tiny Places entries/talk-gnu-linux-in-tiny-places https://mccormick.cx/news/entries/talk-gnu-linux-in-tiny-places gnu-linux-in-tiny-places.jpg

Last night I gave a talk at a Perth Linux User's Group meetup about doing Linuxy stuff on small machines:

]]>
/tags/workblog Wed, 11 Jul 2018 08:10 GMT
Frock: Clojure-flavoured PHP entries/frock-clojure-flavoured-php https://mccormick.cx/news/entries/frock-clojure-flavoured-php Frock is a little experimental tool for writing PHP scripts using Clojure-like LISP syntax.

If you want to see what the code looks like, here's an example which fetches and lists top news items from the Hacker News API.

Some Frock code

Frock could be interesting to you if you are a LISP or Clojure programmer writing a web application which is mostly front-end code, but which needs some small amount of server side logic for e.g. proxying, authentication, data persistence etc. and you want this application to be easily deployable by semi-technical users on commodity hosting.

Basically if your target audience is graphic designers, you like Clojure, and your backend requirements are slim, then you might be interested.

Why?

Pythagoras says no to Fava beans

PHP is an old server-side web development language which is simultaneously loathed by software developers everywhere, and also wildly popular and widely deployed. To reconcile this paradox let's take a look at some pros and cons of PHP.

Cons:

  • Ugly language semantics & features.
  • Dubious security record.
  • Much awful legacy code lying around.

Pros:

  • User-friendly app deployments (simply copy files to server).
  • Widely available on internet servers.
  • Mature language and ecosystem.
  • Excellent documentation.
  • Much useful tech bundled ("batteries included").

The pros make PHP quite democratic. It's very easy to install PHP code on widely available, cheap, commodity hosting. It's easy to get started writing PHP applications; the PHP binary comes pre-installed on OSX for example. PHP contains a lot of capabilities by default: zipping files, opening sockets, encryption, command execution.

Frock exists to make the language semantics and features less of a con for brace wrangling LISP heads, whilst retaining the wide deployment surface and other democratic features of PHP.

]]>
/tags/workblog Wed, 27 Jun 2018 08:51 GMT
Browser Blockchain in ClojureScript entries/browser-blockchain-in-clojurescript https://mccormick.cx/news/entries/browser-blockchain-in-clojurescript I built a little blockchain-in-a-browser in ClojureScript to help understand the underlying algorithms.

You can simulate a network of peers by opening multiple browser tabs. Each peer can mine blocks and make transactions independently and the resulting blockchain will resolve conflicts correctly across all tabs.

A blockchain works by laying down a chain of blocks of transaction data.

Bitcoin whitepaper SPV

Each block in the chain contains a cryptographic hash with two important properties:

  • It proves a link to the previous block.
  • It proves that difficult computational work has been done.

The proof-of-work is accomplished by iteratively updating a nonce until a low-probability hash is discovered.

These two properties mean a blockchain is digital amber.

Insect embedded in amber

If somebody wants to modify a transaction deep inside the amber it would be very difficult because they would have to re-create every layer of the blockchain by doing as much work as the original process required.

In my browser blockchain the hashing is implemented like this:

(hash-object [timestamp transactions previous-hash nonce])

As you can see the previous block's hash is included in the current block.

The hashing is performed iteratively in a loop until a hash with at least one byte of leading zeroes is found:

(loop [c 0]
  (let [candidate-block (make-block (now) transactions previous-hash new-index (make-nonce))]
    ;; keep generating candidate blocks with fresh nonces until the
    ;; first byte of the hash is zero (the proof-of-work target)
    (if (not= (aget (candidate-block :hash) 0) 0)
      (recur (inc c))
      candidate-block)))
]]>
/tags/workblog Wed, 13 Jun 2018 01:30 GMT
Tiny Core Linux on Raspberry Pi entries/tiny-core-linux-on-raspberry-pi https://mccormick.cx/news/entries/tiny-core-linux-on-raspberry-pi Tiny Core Linux SSH screenshot

I'm discovering Tiny Core Linux, a beautifully executed, minimalist, immutable-by-default GNU/Linux distribution that runs nicely on the Raspberry Pi and other x86 and ARM devices.

The documentation is comprehensive but arcane so here are my field notes.

I'm using piCore the Raspberry Pi variant of TinyCore but most of these notes should work for other variants too. These notes assume you are already a GNU/Linux desktop user and know basic Linux command line fu.

Happy penguin jumping

I'm a huge fan of this distribution!

Official disk images.

Here's the official README for the Raspberry Pi version.

Write the disk image to your SD card

sudo dd if=piCore-9.0.3.img of=/dev/mmcblk0

Make sure you get your sd card's device (e.g. mmcblk0) correct so you don't overwrite the wrong disk!

Tip: you can see where dd is up to by sending it a signal:

sudo kill -SIGUSR1 `pgrep -f mmcblk0`

After this put the sd card into your Raspberry Pi and it'll boot Tiny Core Linux with an SSH server running so that you can access it remotely over the network.

Find the Pi on your network

find-pis: use this little script to find the Raspberry Pi on your network so that you can ssh to it.

Default login:

username: tc
password: piCore

Immutable by default

Tiny Core Linux is a perfect fit for Raspberry Pi because it does not write to disk by default. Note that if you want to persist any changes you've made between reboots, you need to do so manually with filetool.sh -b. You'll especially want to do this after first boot to persist the SSH server keys. This includes any changes you make to files in the home directory.

This lack of disk writes is a huge boon because it prevents issues that distributions like Raspbian have with SD card corruption when you power off the Raspberry Pi. Immutability is also just a good thing generally for building robust software systems.

Tiny Core Linux package repository

The packages are .tcz files which like most package formats are a zipped up collection of files with some metadata. You can browse the packages available to the Raspberry Pi distribution.

Various useful commands

Persist current state to disk:

filetool.sh -b

Where to specify additional default file locations for persistence:

/opt/.filetool.lst

Download and install packages (nodejs binary package in this example):

tce-load -wi node.tcz

Show which packages are currently installed:

tce-status -i

Install a package for doing C/C++ compilation & development:

tce-load -wi compiletc

Find current OS version info:

version

Search for and install packages with console UI:

tce

Where to put your own startup and shutdown scripts:

/opt/bootlocal.sh
/opt/shutdown.sh

Getting WiFi working

You'll want the packages wifi and firmware-rpi3-wireless and then reboot and set up a connection:

sudo wifi.sh

Don't forget to persist your changes.

Getting sound working

You'll want the alsa and alsa-utils packages.

I managed to compile Pure Data and output a basic test tone. I went into the src folder of Pd and issued make -f makefile.gnu to build it from the source after checking it out from git.

GPIO access

I managed to interface with the Raspberry Pi GPIO by installing node and then using the onoff package installed via npm. Lots of dev packages were required for a successful build (compiletc probably covers most of them). Each time npm threw an error I looked at what the error message was to determine which package to install next until it built successfully.

I managed to get an LED to blink on pin 16 with this code:

var Gpio = require('onoff').Gpio; // GPIO access via the onoff npm package
var led = new Gpio(16, 'out');    // drive pin 16 as an output
led.writeSync(1);                 // turn the LED on
var c = 0;
// toggle the LED once per second
setInterval(function() { led.writeSync(Number(c = !c)); }, 1000);

I had to run node as root to get access to the GPIO.

Philosophical waxing

GNU/Linux has now been subsumed into the technological substrates of the world. It is found in most phones, servers and appliances that people interact with every day. There are probably 10 copies of Linux running in the median developed-world household. It has become infrastructure, like roads and wires and water pipes.

I don't see that many young people at GNU/Linux meetups these days. I don't think we GNU/Linux nerds are required by the world in the same way that we were before when it was sparkly and new.

It's exciting for me to find Tiny Core Linux which has re-ignited the spark of enthusiasm for this technology.

]]>
/tags/workblog Sat, 03 Feb 2018 05:34 GMT
Tech Nugget entries/tech-nugget https://mccormick.cx/news/entries/tech-nugget DSC_0413.JPG

Secret project.

]]>
/tags/workblog Mon, 29 Jan 2018 04:35 GMT
Lovelace Day 2017 entries/lovelace-day-2017 https://mccormick.cx/news/entries/lovelace-day-2017 A non-exhaustive list of technologies that I used this year which happen to be built by women:

I feel grateful to these people for enriching technology with their contributions.

]]>
/tags/workblog Wed, 11 Oct 2017 06:58 GMT
Clerk: Event Logging Web App entries/clerk-event-logging-web-app https://mccormick.cx/news/entries/clerk-event-logging-web-app Screenshot

There are times in life when you need to keep track of periodic events as they happen. For example, when you have a new baby it is sometimes useful to keep track of feeds, nappy changes, etc. Another example might be tracking how often you eat chocolate or drink beer.

Clerk is a simple self-hosted web application that I built which you can install to the home screen of your device (by doing "add to home screen" in your browser) or load up on your tablet or laptop. You can then keep track of simple events with two taps on your device - once to open the app and once to record the event.

Screenshot 2

For every event logged, the event type, timestamp, and comment are stored in CSV files - one file per event type. You can also download all CSVs stitched together with an extra column for the event name.
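
For illustration, a per-event file and the stitched download look something like this (the exact column layout here is indicative):

feed.csv:

timestamp,comment
2017-07-06 02:15,left side
2017-07-06 05:40,

all-events.csv:

event,timestamp,comment
feed,2017-07-06 02:15,left side
nappy,2017-07-06 06:02,wet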

Features

  • Web based.
  • Easy to deploy.
  • Self-hosted & FLOSS.
  • Simple text based CSV format.
  • Allows multiple people to log events.
  • Mobile friendly - "Add to Home Screen" web-app.

Install

To require authentication, first create a password file:

htpasswd -c /path/to/.htpasswd username

Then copy ./example.htaccess to .htaccess and edit it.

Enjoy!

]]>
/tags/workblog Thu, 06 Jul 2017 02:38 GMT
A Smart Alternative to `watch make` entries/a-smart-alternative-to-watch-make https://mccormick.cx/news/entries/a-smart-alternative-to-watch-make watch-make is a script that rebuilds your project only when make detects it needs a rebuild, for example when source files change.

Features:

  • Works with any existing Makefile based project.
  • No dependencies apart from make.
  • Passes any arguments to make (such as -C mydir).
  • Silent if there is nothing to build and does not swallow output when there is.

I wrote it in response to this Stack Overflow question.

The source code is hosted on GitHub.

Install it

curl -s https://raw.githubusercontent.com/chr15m/watch-make/master/watch-make > ~/bin/watch-make
chmod 755 ~/bin/watch-make

Full source code

#!/bin/sh

while true;
do
  if ! make -q "$@";
  then
    echo "#-> Starting build: `date`"
    make "$@";
    echo "#-> Build complete."
  fi
  sleep 0.5;
done
]]>
/tags/workblog Wed, 31 May 2017 03:13 GMT
New York PdCon 2016 entries/new-york-pdcon-2016 https://mccormick.cx/news/entries/new-york-pdcon-2016 DSC_1662.JPG DSC_1673.JPG DSC_1731.JPG DSC_1701.JPG DSC_1716.JPG DSC_1739.JPG DSC_1864.JPG DSC_1742.JPG DSC_1680.JPG DSC_1690.JPG DSC_1698.JPG DSC_1859.JPG DSC_1793.JPG DSC_1801.JPG DSC_1818.JPG DSC_1788.JPG DSC_1810.JPG DSC_1724.JPG DSC_1862.JPG DSC_1713.JPG DSC_1747.JPG DSC_1743.JPG DSC_1745.JPG DSC_1877.JPG

In November I was in New York for PdCon 2016 and to visit my brother, thanks in large part to my friend Joe Deken and his not-for-profit, New Blankets.

The conference was fantastic. Many fascinating performances, a chance to catch up in person with people from the Pure Data community, and the opportunity to present and perform some of my own work. A highlight for me was hearing Miller Puckette, creator of Pure Data, talk about his approach and philosophy.

On top of that I got to catch up with some awesome people outside of the conference, especially my brother. We went hiking together one day - a rare opportunity to hang out together in nature.

]]>
/tags/workblog Mon, 09 Jan 2017 09:49 GMT
Startup Idea: Web App Installer entries/startup-idea-webapp-store https://mccormick.cx/news/entries/startup-idea-webapp-store Although it is an exaggeration to say that app stores are dead, it is true that web apps (applications that run in a web browser) and the mobile-device browser platforms have become powerful to the point that it is often not necessary to build a native app.

There is a gap in the web app paradigm as users don't always realise they can install a web app they use by going to "Add to homescreen".

An interesting way to fill that gap would be to build an "app store" for web apps. That is, a native app that curates and carries out the "add to homescreen" process for the user for a wide variety of quality web applications. So basically like Firefox OS but without the OS - just the installer, and for the native browser of each platform.

Ideas are cheap and execution is everything - you don't need my permission to take this idea and run with it if you're convinced.

]]>
/tags/workblog Mon, 14 Nov 2016 15:22 GMT
Delhi 2016 entries/delhi-2016 https://mccormick.cx/news/entries/delhi-2016 DSC_0232.JPG DSC_0242.JPG DSC_0251.JPG DSC_0304.JPG DSC_0326.JPG DSC_0350.JPG DSC_0358.JPG DSC_0217.JPG DSC_0411.JPG DSC_0212.JPG DSC_0432.JPG DSC_0453.JPG IMG_20160930_105853.JPG

A couple of weeks ago I was fortunate enough to visit Delhi and spend a week with my colleagues Umang, Gaurav and Tom. We had an excellent and productive week, and in between development discussions Umang was kind enough to drive us to many fascinating and beautiful places - not least of all to enjoy a wonderful meal at his sister's house.

We worked out of Awfis co-working, which I recommend.

I feel lucky to have seen Delhi this way.

]]>
/tags/workblog Thu, 20 Oct 2016 06:57 GMT
Sci Fi UIs in ClojureScript entries/sci-fi-uis-in-clojurescript https://mccormick.cx/news/entries/sci-fi-uis-in-clojurescript I built these sci fi user interfaces using ClojureScript, React, and SVG:

Tap or click to interact with them.

 

 

 

 

 

More here.

Source code here.

]]>
/tags/workblog Sun, 11 Sep 2016 07:13 GMT
OMG Not Another TODO List Application entries/omg-not-another-todo-list-application https://mccormick.cx/news/entries/omg-not-another-todo-list-application My wife and I needed a collaborative shopping list that we could update from our phones. There are proprietary solutions to this but after some research I was surprised to discover that there is no Free Software application that meets the following criteria:

  • Web based.
  • Easy to deploy.
  • Self-hosted & FLOSS.
  • Allows multiple people to update a list.
  • Simple text based format for easy editing.
  • Mobile friendly - "Add to Home Screen" webapp.
  • Satisfies the single use-case of collaborative TODO editing.

Of course I built one with ClojureScript.

Screenshot of OMGNATA

We've been using this "in production" for 3 months and so far it fills our need without issue.

  • Authentication can be accomplished with a .htaccess file or similar.
  • The text-file format is designed so that you can edit lists with a text-editor directly if you want to.
  • If you want to support multiple users you can set up two instances in two different folders and symlink the textfile of the list you want to share between them. Each folder can have its own authentication.
  • You can also do other textfile things like make a symlink into a Syncthing folder which enables you to modify your TODO lists on your laptop or server as well as through the web app.

The realtime updating is accomplished via long-polling. Primarily I used this instead of websockets because when it comes to browsers, older tech is more robust to different operating environments than newer tech.
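
The client side of the long-poll loop is simple enough to sketch (the endpoint name and parameters here are illustrative):

var lastUpdate = 0;

function render(lists) { /* apply the new list state to the UI */ }

function poll() {
  // the server holds the request open until something changes or a
  // timeout elapses, and either way we immediately poll again
  fetch("server.php?wait=30&since=" + lastUpdate)
    .then(function(res) { return res.json(); })
    .then(function(update) {
      lastUpdate = update.timestamp;
      render(update.lists);
      poll();
    })
    .catch(function() { setTimeout(poll, 5000); }); // back off on error
}

poll();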

I resorted to using PHP for a very lightweight server backend because it has the property that basically anybody with web hosting is able to upload a PHP script, and I think it's good to give software as egalitarian a deployment surface as possible. Luckily it is only 150 lines of not-too-painful PHP.

Click here to get the source and download/install it.

]]>
/tags/workblog Sat, 28 May 2016 01:29 GMT
miniCast web based podcast app entries/minicast-web-based-podcast-app https://mccormick.cx/news/entries/minicast-web-based-podcast-app screenshot.png

miniCast is a self-hosted web app for listening to podcasts.

I wrote it while learning ClojureScript. The back-end API is written in PHP. My friend George came up with the name and design.

]]>
/tags/workblog Sun, 25 Oct 2015 12:30 GMT
Live-coding Blender with Hy entries/live-coding-blender-with-hy https://mccormick.cx/news/entries/live-coding-blender-with-hy

Hy is a Clojure-like LISP that compiles to Python bytecode. Blender is a popular Free and Open Source 3d modelling program. This is a little livecoding experiment I put together with the two of them.

Source code is on GitHub.

]]>
/tags/workblog Wed, 17 Jun 2015 03:35 GMT
ImageSnapper Chrome Extension entries/imagesnapper-chrome-extension https://mccormick.cx/news/entries/imagesnapper-chrome-extension ImageSnapper icon

ImageSnapper is a Chrome browser extension I wrote a while back for generating an RSS feed of found images. When you right-click an image and select "Snap image to RSS feed" from the menu the image will be saved to a local folder (configurable) and added to the RSS feed in the same folder. Sync the folder to a web accessible location to publish your feed.

ImageSnapper screenshot

Source code is on GitHub.

Install from the Chrome web store.

]]>
/tags/workblog Sun, 05 Apr 2015 09:56 GMT
OMG Clojure entries/omg-clojure https://mccormick.cx/news/entries/omg-clojure Last weekend was Global GameJam. My buddy Crispin suggested we use Clojure, a non-traditional member of the LISP family running on the Java virtual machine, and its browser-friendly cousin ClojureScript.

Here is the game we built together, using graphics from Kenney.nl:


(click to play)

You can find the source code on GitHub.

A couple of weeks ago I installed Leiningen, the Clojure package/dependency manager, and the VIM plugins, and started practicing.

Literally every night for two weeks now I have been dreaming in parentheses. Writing code has never felt so comfortable. I think I may officially be a LISP convert.

]]>
/tags/workblog Thu, 29 Jan 2015 12:25 GMT
GBA GPIO tester entries/gba-gpio-tester https://mccormick.cx/news/entries/gba-gpio-tester Here is a bit of code I wrote recently for the Gameboy Advance platform. It lets you use the device's buttons to toggle GPIO pins.

animation of the code in action

Recently I resurrected LooperAdvance, a bit of 10-year-old code for making music with a Gameboy Advance. I am hacking in synchronisation so that I can sync up the loops with Pure Data running on a Raspberry Pi.

The Gameboy Advance has a pretty robust little 4-bit GPIO rated at 3.3v that could be neat for controlling some projects now that you can get them quite cheaply second hand. There are also wireless adapters available that will instantly give you 4 bits of output by remote control.

]]>
/tags/workblog Tue, 16 Dec 2014 01:58 GMT
RSS Image Feed Picture Frame entries/rss-image-feed-picture-frame https://mccormick.cx/news/entries/rss-image-feed-picture-frame I should probably post more on here about the thing I spend most of my time doing - writing code.

RSS Image Feed Picture Frame is a little piece of software I finished recently. You give it a list of RSS/tumblr feeds that have nice graphics as their content and it will display a random image from one of the feeds. There is a small daemon which displays just the images full screen and fades between them.

The idea is to run it as a lightweight "picture frame" app for example on a Raspberry Pi powered HDMI screen hanging on your wall.

]]>
/tags/workblog Tue, 14 Oct 2014 12:14 GMT
Ciao Torino entries/ciao-torino https://mccormick.cx/news/entries/ciao-torino IMG_20140603_194131.jpg

This is the new Pebble watch-face I've been developing - "Torino". I think it is finished now. I am naming it after the lovely city we visited last weekend.

Information displayed on the watch-face:

  • Current time.
  • Name of your present locale.
  • Custom text from a file on your server called message.txt (here that is the number visible below the locale).
  • Current weather conditions and temperature.
  • Current week day, day of the month, and month name.

Tap here on your device to download the Torino watch-face onto your Pebble.

Help yourself to the GPL licensed source code. Edit the file src/build_config.h if you want to customise the settings. I'd especially appreciate it if you run your own copy of the PHP script and change the URL in the build config to reflect your own server so as not to place load on my server. That also lets you set up your own scripts to write whatever information you like into message.txt on your server. There is a demo script for displaying unread mail counts in that space, for example.

Maker Faire Torino

Had a great time and met some lovely people, gave a small performance with my GarageAcidLab Pd patch.

The QA session between Bruce Sterling and Massimo Banzi was hilarious. After hearing them talk I feel like I should stick a little sundial and lodestone compass on my watch just in case. Probably not a bad idea.

If you aren't already familiar with Bruce Sterling's work you should really check him out. My friend Fenris got me into his work some years ago. Way ahead of the curve for several decades now. I feel very fortunate I got to meet Bruce in Torino, and he was tremendously nice, showing me some amazing gesture controlled visualisations on his laptop and signing a maker-scene object for my friend. I'm kicking myself I didn't ask him a lot more questions.

]]>
/tags/workblog Fri, 06 Jun 2014 19:52 GMT
My Own Code Running On My Wristwatch entries/my-own-code-running-on-my-wristwatch https://mccormick.cx/news/entries/my-own-code-running-on-my-wristwatch Photograph of my Pebble wristwatch

I have not worn a wristwatch for more than a decade. Feels pretty futuristic.

Here is the source code for my custom Pebble app. It's a fork of another app, and it fetches information from my server and displays it to me. Notifications are handled by a different app.

Of course, the compiler is Richard Stallman's.

]]>
/tags/workblog Fri, 25 Apr 2014 20:33 GMT
3d Printable Sixteen Sided Hexadecimal Die entries/3d-printable-sixteen-sided-hexadecimal-die https://mccormick.cx/news/entries/3d-printable-sixteen-sided-hexadecimal-die Hexadecimal Die

This is a hexadecimal (sixteen-sided) die that you can 3D print. Download the STL file or get the source code on GitHub. Hopefully it's useful when generating private keys and the like; see the sketch below for turning rolls into key bytes.
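
Since each face is one hex digit, two rolls give one byte: a 128-bit key takes 32 rolls and a 256-bit key takes 64. Here's a tiny hypothetical helper (not part of the linked source) for turning a transcription of rolls into key bytes:

#include <stdio.h>

int main(void) {
    // Each die roll is written down as one hex digit; pairs become key bytes.
    const char rolls[] = "0f4a9c2e"; // example transcription of 8 rolls
    for (size_t i = 0; rolls[i] && rolls[i + 1]; i += 2) {
        unsigned int byte;
        sscanf(&rolls[i], "%2x", &byte);
        printf("byte %zu = 0x%02x\n", i / 2, byte);
    }
    return 0;
}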

]]>
/tags/workblog Mon, 27 Jan 2014 14:04 GMT
A Better Xubuntu Lockscreen entries/a-better-xubuntu-lockscreen https://mccormick.cx/news/entries/a-better-xubuntu-lockscreen Here is how you can have a lockscreen on Xubuntu that just displays a custom logo/graphic and asks for your password. The graphic I have used is a white logo on black background, so the background/foreground settings reflect that but you can change these colours to match whatever graphic you use.

sketch of a padlock

First, you need to create a script at $HOME/bin/xflock4 - as long as $HOME/bin comes before the system directories in your PATH, this overrides the default xflock4 script that XFCE uses as a proxy for launching the lock screen. Here are the contents, which just run the xlock program:

#!/bin/sh
xlock

Don't forget to make it executable with chmod 755 ~/bin/xflock4.

Next, you should create an xpm version of the graphic that you want to use, for example at $HOME/.lock.xpm - you can use the ImageMagick convert program (sudo apt-get install imagemagick) to convert another image into the correct xpm format like this:

convert myimage.png ~/.lock.xpm

Make a note of the dimensions of your image, as you will need them for the 'icongeometry' option below. The largest size appears to be 256x256 pixels, and the man page advises making the image as close to square as possible. Set the 'image.bitmap' option below to the location of the xpm file. Finally, add the following configuration options to a file called $HOME/.Xdefaults:

XLock*mode: image
XLock*image.bitmap: /home/chrism/.lock.xpm
XLock*image.count: 1
XLock*image.erasedelay: 0
XLock*erasedelay: 0
XLock*icongeometry: 180x180
XLock*background: Black
XLock*foreground: White
XLock*description: off
XLock*info:

You can find out other options by running man xlock.

To test it, select 'Lock Screen' from the user menu.

Enjoy!

]]>
/tags/workblog Tue, 17 Dec 2013 08:13 GMT
Script to Fetch BitBucket Tasks as Text entries/script-to-fetch-bitbucket-tasks-as-text https://mccormick.cx/news/entries/script-to-fetch-bitbucket-tasks-as-text sunset-newhouse.jpg

These days I am using Bitbucket for git repository hosting and task management on my commercial projects. One thing I often need to do is fetch a list of current tasks so that I can update a client on what is on our agenda for them, get feedback about priorities, and so on. Here's a short Python script I hacked together to do that, based on some other public domain scripts I found out there:

import base64
import cookielib
import urllib2
import json

class API:
    """Minimal client for the Bitbucket 1.0 REST API, using HTTP Basic auth."""
    api_url = 'http://api.bitbucket.org/1.0/'

    def __init__(self, username, password, proxy=None):
        encodedstring = base64.encodestring("%s:%s" % (username, password))[:-1]
        self._auth = "Basic %s" % encodedstring
        self._opener = self._create_opener(proxy)

    def _create_opener(self, proxy=None):
        # Build a urllib2 opener with cookie handling and an optional proxy.
        cj = cookielib.LWPCookieJar()
        cookie_handler = urllib2.HTTPCookieProcessor(cj)
        if proxy:
            proxy_handler = urllib2.ProxyHandler(proxy)
            opener = urllib2.build_opener(cookie_handler, proxy_handler)
        else:
            opener = urllib2.build_opener(cookie_handler)
        return opener

    def get_issues(self, username, repository, arguments):
        # Fetch the repository's issue list, filtered by the given query arguments.
        query_url = self.api_url + 'repositories/%s/%s/issues/' % (username, repository)
        if arguments:
            query_url += "?" + "&".join(["=".join(a) for a in arguments])
        try:
            req = urllib2.Request(query_url, None, {"Authorization": self._auth})
            handler = self._opener.open(req)
        except urllib2.HTTPError, e:
            print e.headers
            raise e
        return json.load(handler)

if __name__ == "__main__":
    import sys
    if len(sys.argv) < 5:
        print "Usage: %s username password baseuser repository" % (sys.argv[0],)
    else:
        result = API(sys.argv[1], sys.argv[2]).get_issues(sys.argv[3], sys.argv[4], (("status", "new"), ("status", "open"), ("limit", "50")))
        for p in result["issues"]:
            # Issues with somebody assigned ("responsible") get a double star.
            print " *%s %s" % (p.has_key("responsible") and "**" or "", p["title"])
            #print p["content"]
            print

Run it with no arguments to get a usage message.
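
For example, if you saved it as bbtasks.py (a hypothetical filename):

python bbtasks.py myuser mypass clientuser clientrepo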

I secretly wish that bzr had won the distributed version control wars, because of its superior user interface, but these days I am resigned to using git since pretty much everybody I have to interoperate with is using it. It's not that bad.

]]>
/tags/workblog Mon, 18 Mar 2013 02:02 GMT
Isolated Browsing with Chrome entries/isolated-browsing-with-chrome https://mccormick.cx/news/entries/isolated-browsing-with-chrome Pac-Man Ghost

You can run a single-use instance or site-specific browser using the Chrome web browser, with its own self-contained user data, on GNU/Linux like this:

chromium-browser --app=https://plus.google.com/ --user-data-dir=$HOME/.chromium-googleplus

This launches Chrome in an 'app' style window with just that particular web app visible (in this case, Google Plus) and no location bar, navigation buttons, etc. It isolates the session from the rest of your browsing, preventing social networking sites, for example, from tracking you too effectively. If you are logged into Facebook in a regular browser, that company can see every site you visit that contains a Facebook 'like' button or comments section. They are effectively gathering a comprehensive browsing history of every Facebook user who remains logged in. Do you want Facebook to know your browsing history?

It should be possible to do something very similar with Chrome for Windows and a batch file.

On Mac you can use Fluid if you want to pay for ease of use, or apply the same trick as above by calling the Chrome binary directly, e.g. "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --app=https://plus.google.com/ --user-data-dir=$HOME/.chrome-googleplus (the quotes are needed because of the spaces in the path). Save that as a shell script for one-click access.

If you want to further anonymise your browsing (e.g. to obscure your IP address) you might consider using software like Tor.

]]>
/tags/workblog Wed, 20 Feb 2013 08:06 GMT
Optimise HTML5 Apps For Mobile WebKit entries/optimise-html5-apps-for-mobile-webkit https://mccormick.cx/news/entries/optimise-html5-apps-for-mobile-webkit
  • Use zepto.js, not jQuery.
  • Use -webkit-transition and -webkit-transform wherever possible.

These tips are especially useful if you are developing with PhoneGap/Cordova on iPad and iPhone, or on Android. Those WebKit transforms saved my bacon on a recent iPad project. Here is the basic CSS to put on an element you want to optimise:

-webkit-transition: -webkit-transform 0ms;
-webkit-backface-visibility: hidden;

Try replacing $(this).hide(); with a transform that moves the element off screen, like this (you might need overflow: hidden on your body tag):

$(this).css("-webkit-transform", "translateX(1500px)");

Then when you want to show the element again, do this instead:

$(this).css("-webkit-transform", "translateX(0px)");

I also had great success replacing jQuery UI drag-and-drop code with something hand-rolled:

$(this).css("-webkit-transform", "translateX(" + relativeX + "px) translateY(" + relativeY + "px)");

Hope this helps you!

]]>
/tags/workblog Mon, 10 Dec 2012 06:50 GMT
My first Makerbot print entries/my-first-makerbot-print https://mccormick.cx/news/entries/my-first-makerbot-print IMG_20121115_181922.jpg

My first MakerBot print. It felt like the first time Mum and Dad brought home the Apple IIe. Everything feels different today than it was yesterday. Thanks so much to Mike for letting me (ab)use his MakerBot!

The Lego man is from Thingiverse. The arcade machine is a hack of one from the SketchUp warehouse.

]]>
/tags/workblog Fri, 16 Nov 2012 01:53 GMT