Introducing Huey!

Today I’d like to share Huey with you, already up and running.

Huey is an ongoing experiment of mine, designed to be a news reader and archiver.

This software is still in its early stages. You get to see it as I’m building it. I’m not aiming to make it a product at this point, but I’d like to get it to a usable and interesting state.

The name takes inspiration from the band “Huey Lewis and The News”.

Sources are available on GitHub, along with a project tracker with next steps. Feedback is more than welcome.

Another feed reader?

Quite so. Huey was born from the ashes of a previous experiment of mine called LiteRSS, and hopefully captures a bit better what I learned building it.

Around 5 years ago I found myself commuting in a low cell signal area. Reading JavaScript- and asset-heavy blogs became impossible, so I thought I’d build a tool that would extract the relevant bits server-side, stripping out gunk like ads and assets used for meaningless stylistic flourishes, and delivering just the essential.
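To make that idea concrete, here’s a minimal sketch of that kind of server-side extraction. The library choices (requests and readability-lxml) and the function name are mine, for illustration; they aren’t necessarily what LiteRSS used:

    # A minimal sketch of server-side article extraction: fetch a page and
    # keep only its readable core, with scripts, ads and styling stripped.
    import requests
    from readability import Document  # pip install readability-lxml

    def extract_essential(url):
        """Return (title, simplified_html) for a given article URL."""
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        doc = Document(response.text)
        # short_title() drops site-name suffixes; summary() returns the
        # main content with boilerplate removed.
        return doc.short_title(), doc.summary()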

The interface was made to be incredibly simple, mimicking CNN Lite’s. Unlike all other feed readers, there were no categories, no distinction between read and unread (save for blue and purple links) and, what I came to enjoy the most, no unread counts anywhere. Just a list of chronologically sorted news items that I could choose to visit or scroll past.

That made my reading experience much more pleasurable and less prone to anxiety (still 300 items to read!!). This was a nice surprise, and it set me off researching whether anyone else had come to similar conclusions.

I’ve found out that Dave Winer, one of the designers of RSS, has been doing work around this topic for quite a while now. He calls this approach “Rivers”, and he maintains quite a few. While our ideas around it are not fully identical, they differ very little. The current implementation of Huey doesn’t yet match what I’m looking for, but we’ll get there.

I also wondered for a while how RSS readers would have evolved had they not been driven to near irrelevance by social networks. Looking back, all readers seemed to mimic the Usenet and BBS experience. Every one of them looked like a slightly odd email client, with folders and unread counts. The thing with electronic correspondence, though, is that there’s a natural incentive to read all of it, or at least to sift through it. If a reader looks and behaves like an email client, it will end up sending the user the same signals, and start flagging news articles as things that require immediate attention. All of them. As the web grew in size, so did the number of items to read, and with it the pressure to read everything, which made this UX paradigm unbearable for me. A commitment-free feed made it easier to digest this deluge of items.

Of course, there are still tricks to employ when viewing news articles as an unread-less timeline: some sources output way more articles than others, so there needs to be a way to quickly go through those items without disrupting flow. Also, there may be sources that are valued a lot and output very little, so those will have to be brought front and centre.
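As an illustration of the first trick, here’s a rough sketch that folds long runs of items from the same chatty source into a single expandable group; the names and the threshold are my own assumptions, not Huey’s actual code:

    # Collapse consecutive runs of items from one high-volume source so
    # the timeline can be skimmed without that source drowning the rest.
    from itertools import groupby

    MAX_RUN = 3  # show at most this many items from one source in a row

    def collapse_runs(items):
        """items: dicts sorted newest-first, each with a 'source' key.
        Yields plain items, or ('collapsed', source, rest) groups."""
        for source, run in groupby(items, key=lambda item: item["source"]):
            run = list(run)
            # Keep the newest few visible, fold the rest behind one line.
            yield from run[:MAX_RUN]
            if len(run) > MAX_RUN:
                yield ("collapsed", source, run[MAX_RUN:])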

The sorting algorithm is chronological for now, but I’m not against the idea of starting to use “AI” elements to help sort the timeline, like clustering and relevance sorting, for a possibly more interesting River (still to be determined). I personally don’t think the “algorithm” is the enemy, just the lack of control over it (corporate recommendation algorithms are not actually listening to your signals, but rather serving as profit maximisers for the companies that build them), and the lack of choice over whether to use it or not, which is something I’ll look out for when building Huey.
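For reference, the chronological baseline is simple enough to sketch. Assuming feedparser as the parsing library (my choice for illustration; the URLs are placeholders), a flat river boils down to a merge and a sort:

    # Merge entries from several feeds into one newest-first river.
    import feedparser  # pip install feedparser

    FEEDS = [
        "https://example.com/feed.xml",
        "https://example.org/atom.xml",
    ]

    def build_timeline(feed_urls):
        entries = []
        for url in feed_urls:
            for entry in feedparser.parse(url).entries:
                # Entries without a parsed date are skipped for simplicity.
                if entry.get("published_parsed"):
                    entries.append(entry)
        # struct_time values compare chronologically: newest first.
        return sorted(entries, key=lambda e: e.published_parsed, reverse=True)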

Your own personal internet (archive)

I’ve become increasingly interested in keeping personal records of news articles, blog posts and other internet content I’ve come across. These last few years have given us a glimpse of how ephemeral the internet can be: companies change owners who enforce new policies on content (and delete old content that doesn’t match the new guidelines), or publishers simply change URL schemes. Old content disappears, either from existence altogether or by simply becoming unresolvable. Other times people just move on with their lives and abandon their blogs. Content disappears, and as this decade moves forward, I fear this effect will only amplify.

While Internet archives already exist, what I’d be interested in is helping them preserve digital assets while at the same time keeping a copy for my personal records. I plan to have Huey do this as it gathers new URLs from RSS and Atom feeds, and from other sources to be implemented along the way.
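A sketch of what that step could look like, under my assumptions about naming and storage layout: each newly discovered URL is submitted to the Wayback Machine’s public Save Page Now endpoint (web.archive.org/save/<url>) and mirrored locally at the same time:

    # For each new URL from a feed: ask the Internet Archive to snapshot
    # it, then keep a raw personal copy keyed by a hash of the URL.
    import hashlib
    import pathlib
    import requests

    ARCHIVE_DIR = pathlib.Path("archive")  # local store; name is illustrative

    def archive_url(url):
        # 1. Help the public archive preserve it (errors ignored in this sketch).
        requests.get(f"https://web.archive.org/save/{url}", timeout=60)
        # 2. Keep a personal copy.
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        ARCHIVE_DIR.mkdir(exist_ok=True)
        name = hashlib.sha256(url.encode()).hexdigest() + ".html"
        path = ARCHIVE_DIR / name
        path.write_text(response.text, encoding="utf-8")
        return path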

I’d like to be able to process and reprocess the data for my own personal use. The most obvious case is building and improving a personal content search engine: to find that blog post read a long time ago that had incredible insights, to check what the news was on the birthday of someone important, or to do generic research.

A small aside here: I’d love it if digital newspapers had some sort of searchable archive of all their news of all time. Not necessarily to scrape and add to my personal archive; I think it would be a dream come true for any journalist doing research, especially now that new AI-based tools can detect references to people, places and events more accurately, and may help trace timelines and surface interesting stories. But again, I digress.

LiteRSS did a little of this, as it would try to scrape the content of articles it came across. But this scraping was destructive, keeping only unformatted text and a couple of images. I’ve since changed my approach in Huey: entries in the timeline will lead you directly to the source of information instead of showing you a mangled version of it. After all, form and content go hand in hand. My plans with Huey at present include just adding a reference to the personal archive item for each entry, and using a small subset of text or other elements to make it searchable, thus creating a semantically searchable personal archive of the internet.
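As a rough sketch of that data model (table and column names are illustrative, and this gives plain full-text rather than semantic search), SQLite’s FTS5 extension could index the small text subset while each entry keeps a pointer to its archived copy:

    # Each timeline entry links to the live source and to its archive
    # item; a small text subset goes into an FTS5 index for searching.
    import sqlite3

    conn = sqlite3.connect("huey.db")
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS entries (
            id           INTEGER PRIMARY KEY,
            url          TEXT NOT NULL,  -- leads straight to the source
            archive_path TEXT,           -- reference to the personal archive item
            published    TEXT
        );
        -- Full-text index over a small subset of each entry's text.
        CREATE VIRTUAL TABLE IF NOT EXISTS entry_search
            USING fts5(title, excerpt);
    """)

    def search(query):
        # entry_search rows are inserted with rowid = entries.id elsewhere.
        return conn.execute(
            "SELECT rowid, title FROM entry_search WHERE entry_search MATCH ?",
            (query,),
        ).fetchall()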

What’s next

I’ve got several ideas on what I’d like to do next; the project tracker on GitHub lists them.

Development shall continue at a hobby-project-like pace: since this is a side thing, there are no commitments to uptime or stability.

More updates coming soon.