I keep a list of neat little projects that I think some programmers might find fun to write for themselves, even if they don’t end up using them all the time. I figure I’ll list out what they are, with quick blurbs about why I think they’re neat and some ideas for how to tackle them. I’ll do a few more write-ups with more stuff from that list from time to time.
Some projects (all web-based):
One of the best tools I used to have a copy of was Microsoft OneNote, but for the game development my friends and I used to do, we were big fans of DokuWiki.
Anyways, ignoring the OG WikiWikiWeb, the feature list of a modern wiki (say, MediaWiki) looks something like:
This seems like a lot, but developing a wiki for personal use doesn’t require nearly that much stuff. A minimal feature list for one might look like:
Now, with those two things, you can imitate most of what you’ll use on a daily basis. Since we’re not supporting multiple users, if we keep our usage simple we can get away with basic HTML forms, with no need for websockets or long-polling or whatever.
For linking, the functionality we want is the ability to define a link inside the wiki, and then have it automatically link to a “create a page” form if the page doesn’t exist, or to the page if it does.
The way to do that is to generate the page HTML for everything, crawl and collect metadata on all the pages, and then rewrite all the HTML links to point at the correct thing (external link, internal stub link, or internal page) before finally writing everything to disk (or database or wherever you think it should live).
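Here is a rough sketch of that link-rewriting pass in Python, just to show the shape of it. The `/wiki/` and `/create?page=` URL layout, the `stub` class, and the `rewrite_links` name are all made up for illustration, and it assumes internal links are written as bare page names in double-quoted `href` attributes. A real version would probably want a proper HTML parser instead of a regex.

```python
import re
from urllib.parse import quote, urlparse

# Assumption: hrefs are always double-quoted. Good enough for HTML we generated ourselves.
HREF = re.compile(r'href="([^"]*)"')


def rewrite_links(html, existing_pages):
    """Point each href at an external URL, an existing page, or a create-page form."""

    def fix(match):
        target = match.group(1)
        if urlparse(target).scheme:
            # Has a scheme (https:, mailto:, ...): external link, leave it alone.
            return match.group(0)
        if target in existing_pages:
            # Hypothetical URL layout for pages that exist.
            return f'href="/wiki/{quote(target)}"'
        # Hypothetical URL layout for the "create a page" form.
        return f'href="/create?page={quote(target)}" class="stub"'

    return HREF.sub(fix, html)
```

You would build `existing_pages` from the crawl over all your pages, then call `rewrite_links(page_html, existing_pages)` on each page right before writing it out. One caveat of this sketch: a page name containing a colon looks like a URL scheme and gets treated as external.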
Similarly, we can use the HTML transform trick to handle things like downloading and inlining images for use in pages (done in my blog generator), discovering footnotes and appending them, and all sorts of other things.
The humble personal wiki itself, though, starts with those two basic features.
WordPress is the 800-lb gorilla of the blogging realm, closely followed by Medium. If you like hosting your own files though (and you should, because the web is meant to be federated, but that’s another story entirely) you’ve probably looked at Octopress or Jekyll or something. These platforms offer features like:
Again, you can throw away almost all of that in order to get a small feature set that might take a day or two to make:
Most of that can be handled really easily, and in fact you can get away with just bash and pandoc for the Markdown conversion.
Something that will trip you up is that, once you want to create any sort of interesting indices, page headers or footers linking to other posts, tag indices, or whatever else, you’ll discover that you need to actually write code to collect the metadata for each post and then use that to generate those other resources. That will probably mean you need to figure out how you want to attach metadata to a post. I use something similar to Octopress where I store a YAML-like blob at the top of my file, but you could use a separate JSON or YAML document with the same filename as the post and a different extension, or whatever else.
For comments, you might want to embed something like Disqus, but I’ve found that if you really need feedback, then an email address is probably better.
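To make the metadata idea concrete, here is a minimal sketch in Python. It assumes a deliberately dumb format (not real YAML, and not necessarily what Octopress does): a block of `key: value` lines fenced by `---` at the top of the file, with tags as a comma-separated string. The function names and the `title` and `tags` keys are mine, purely for illustration.

```python
def split_front_matter(text):
    """Return (metadata dict, body) for a post that starts with a '---' block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no metadata block at all
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Closing delimiter: everything after it is the post body.
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # never found the closing '---': treat it all as body


def build_tag_index(posts):
    """Map each tag to the sorted titles of the posts that carry it."""
    index = {}
    for meta, _body in posts:
        for tag in meta.get("tags", "").split(","):
            tag = tag.strip()
            if tag:
                index.setdefault(tag, []).append(meta.get("title", "untitled"))
    return {tag: sorted(titles) for tag, titles in index.items()}
```

Run `split_front_matter` over every post first, then feed the collected results to `build_tag_index` (or whatever other index you want) before rendering anything. The point is just that the metadata pass has to happen over all posts before any page that links between them can be generated.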
So, everybody needs a pet spider. Big commercial search engines like Google or crawlers like 80legs have a whole bunch of features, obviously:
And that’s all just table stakes. As you can probably guess, a simple little pet spider doesn’t need to do nearly that much.
A pet spider probably wants to know how to:
This project probably wants a little database (say, SQLite or Postgres) for storing the queue and skipping redundant crawls, but you could conceivably just use a flat text file and some bash trickery to check for duplicates. An MVP for this might well look like a bash script with curl and pup and a bit of cleverness.
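If you go the database route, the queue-plus-dedupe part might look something like this sketch in Python with the standard-library `sqlite3` module. The `pages` table, the function names, and the choice to pass in a `fetch` callable are all my own assumptions for illustration. It skips everything a polite spider should do (robots.txt, rate limiting, error handling).

```python
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def open_queue(path=":memory:"):
    """One table doubles as the crawl queue and the seen-URL set."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS pages "
        "(url TEXT PRIMARY KEY, done INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def enqueue(db, url):
    url = urldefrag(url).url  # drop #fragments so they don't count as new pages
    # The primary key makes duplicates a no-op, which is the "skip redundant crawls" part.
    db.execute("INSERT OR IGNORE INTO pages (url) VALUES (?)", (url,))


def crawl(db, fetch, limit=100):
    """Visit queued URLs in insertion order; fetch(url) must return HTML text."""
    visited = []
    while len(visited) < limit:
        row = db.execute(
            "SELECT url FROM pages WHERE done = 0 ORDER BY rowid LIMIT 1"
        ).fetchone()
        if row is None:
            break  # queue is empty
        url = row[0]
        db.execute("UPDATE pages SET done = 1 WHERE url = ?", (url,))
        parser = LinkParser()
        parser.feed(fetch(url))
        for href in parser.links:
            absolute = urljoin(url, href)
            if absolute.startswith(("http://", "https://")):
                enqueue(db, absolute)
        visited.append(url)
    return visited
```

Taking `fetch` as a parameter means you can try the queue logic against a dict of fake pages before pointing it at the real web. For real crawling, you could pass something built on `urllib.request.urlopen`, and open the queue with a file path instead of `:memory:` (plus a `db.commit()` now and then) so it survives restarts.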
Obviously, as time goes on, you’ll want to festoon your little spider with gear and features so it can have better luck on the World Wide Web, but making one to start with isn’t too gnarly.
I believe that almost every developer is skilled enough to tackle these sorts of things, if they only tried. We look at the big players online today and wonder how we can ever live in a world not dictated by their systems. Google doesn’t have to own search. Wikipedia doesn’t have to own knowledge. Medium doesn’t have to own self-indulgent screeds.
To make a better world, we need to empower users. To empower users, we need to educate them to be autonomous. To be autonomous, they have to seize the means of production.
My hope is that, by doing little projects like these, you can realize your potential to be autonomous to one degree or another from those orgs, and feel capable of bringing the fire down to the masses and giving them a chance to be autonomous too.