ongoing fragmented essay by Tim Bray

Why Not Bluesky 15 Nov 2024, 9:00 pm

As a dangerous and evil man drives people away from Xitter, many stories are talking up Bluesky as the destination for the diaspora. This piece explains why I kind of like Bluesky but, for the moment, have no intention of moving my online social life away from the Fediverse.

(By “Fediverse” I mean the social network built around the ActivityPub protocol, which for most people means Mastodon.)

If we’re gonna judge social-network alternatives, here are three criteria that, for me, really matter: Technology, culture, and money.

I don’t think that’s controversial. But this is: Those are in increasing order of importance. At this point in time, I don’t think the technology matters at all, and money matters more than all the others put together. Here’s why.

Technology

Mastodon and the rest of the Fediverse rely on ActivityPub implementations. Bluesky relies on the AT Protocol, of which so far there’s only one serious implementation.

Both of these protocols are good enough. We know this is true because both are actually working at scale, providing good and reliable experiences to large numbers of people. It’s reasonable to worry what happens when you get to billions of users and also about which is more expensive to operate. But speaking as someone who spent decades in software and saw it from the inside at Google and AWS, I say: meh. My profession knows how to make this shit work and work at scale. Neither alternative is going to fail, or to trounce its competition, because of technology.

I could write many paragraphs about the competing nice features and problems of the competing platforms, and many people have. But it doesn’t matter that much because they’re both OK.

Culture

At the moment, Bluesky seems, generally speaking, to be more fun. The Fediverse is kind of lefty and geeky and queer. The unfortunate Mastodon culture of two years ago (“Ewww, you want us to have better tools and be more popular? Go away!”) seems to have mostly faded out. But the Fediverse doesn’t have much in the way of celebrities shitposting about the meme-du-jour. In fact it’s definitely celebrity-lite.

I enjoy both cultural flavors, but find Fedi quite a lot more conversational. There are others who find the opposite.

More important, I don’t think either culture is set in stone, or has lost the potential to grow in multiple new, interesting directions.

Money

Here’s the thing. Whatever you think of capitalism, the evidence is overwhelming: Social networks with a single proprietor have trouble with long-term survival, and those that do survive have trouble with user-experience quality: see Enshittification.

The evidence is also perfectly clear that it doesn’t have to be this way. The original social network, email, is now into its sixth decade of vigorous life. It ain’t perfect but it is essential, and not in any serious danger.

The single crucial difference between email and all those other networks — maybe the only significant difference — is that nobody owns or controls it. If you have a deployment that can speak the languages of IMAP and SMTP and the many anti-spam tools, you are de facto part of the global email social network.

The definitive essay on this question is Mike Masnick’s Protocols, Not Platforms: A Technological Approach to Free Speech. (Mike is now on Bluesky’s Board of Directors.)

What does success look like?

My bet for the future (and I think it’s the only one with a chance) is a global protocol-based conversation with many thousands of individual service providers, many of which aren’t profit-oriented businesses. One of them could be your local Buddhist temple, and another could be Facebook. The possibilities are endless: Universities, government departments, political parties, advocacy organizations, sports teams, and, yes, tech companies.

It’s obvious to me that the Fediverse has the potential to become just this. Because it’s most of the way there already.

Could Bluesky? Well, maybe. As far as I can tell, the underlying AT Protocol is non-proprietary and free for anyone to build on. Which means that it’s not impossible. But at the moment, the service and the app are developed and operated by “Bluesky Social, PBC”. In practice, if that company fails, the app and the network go away. Here’s a bit of Bluesky dialogue:

Bluesky dialog between myself and @mmasnick

In practice, “Bsky corp” is not in immediate danger of hard times. Their team is much larger than Mastodon’s and on October 24th they announced they’d received $15M in funding, which should buy them at least a year.

But that isn’t entirely good news. The firm that led the investment is seriously sketchy, with strong MAGA and cryptocurrency connections.

The real problem, in my mind, isn’t in the nature of this particular Venture-Capital operation. Because the whole raison d’être of Venture Capital is to make money for the “Limited Partners” who provide the capital. Since VC investments are high-risk, most are expected to fail, and the ones that succeed have to exhibit exceptional revenue growth and profitability. Which is a direct path to the problems of survival and product quality that I mentioned above.

Having said that, the investment announcement is full of soothing words about focus on serving the user and denials that they’ll go down the corrupt and broken crypto road. I would like to believe that, but it’s really difficult.

To be clear, I’m a fan of the Bluesky leadership and engineering team. With the VC money as fuel, I expect their next 12 months or so to be golden, with lots of groovy features and mind-blowing growth. But that’s not what I’ll be watching.

I’ll be looking for ecosystem growth in directions that enable survival independent of the company. In the way that email is independent of any technology provider or network operator.

Just like Mastodon and the Fediverse already are.

Yes, in comparison to Bluesky, Mastodon has a smaller development team and slower growth and fewer celebrities and less buzz. It’s supported by Patreon donations and volunteer labor. And in the case of my own registered co-operative instance CoSocial.ca, membership dues of $50/year.

Think of the Fediverse not as just one organism, but a population of mammals, scurrying around the ankles of the bigger and richer alternatives. And when those alternatives enshittify or fall to earth, the Fediversians will still be there. That’s why it’s where my social-media energy is still going.

On the Fediverse you can follow a hashtag and I’m subscribed to #Bluesky, which means a whole lot of smart, passionate writing on the subject has been coming across my radar. If you’re interested enough to have read to the bottom of this piece, I bet one or more of these will reward an investment of your time:

Maybe Bluesky has “won”, by Gavin Anderegg, goes deep on the trade-offs around Bluesky’s AT Protocol and shares my concern about money.
Blue Sky Mine, by Rob Horning, ignores technology and wonders about the future of text-centric social media and is optimistic about Bluesky.
Does Bluesky have the juice?, by Max Read, is kind of cynical but says smart things about the wave of people currently landing on Bluesky.
The Great Migration to Bluesky Gives Me Hope for the Future of the Internet, by Jason Koebler over at 404 Media, is super-optimistic: “Bluesky feels more vibrant and more filled with real humans than any other social media network on the internet has felt in a very long time.” He also wonders out loud if Threads’ flirtation with Mastodon has been damaging. Hmm.
And finally there’s Cory Doctorow, probably the leading thinker about the existential conflict between capitalism and life online, with Bluesky and enshittification. This is the one to read if you’re thinking that I’m overthinking and over-worrying about a product that is actually pretty nice and currently doing pretty well. If you don’t know what a “Ulysses Pact” is, you should read up and learn about it. Strong stuff.

Privacy, Why? 14 Nov 2024, 9:00 pm

They’re listening to us too much, and watching too. We’re not happy about it. The feeling is appropriate but we’ve been unclear about why we feel it.

[Note: This is adapted from a piece called Privacy Primer that I published on Medium in 2013. I did this mostly because Medium was new and shiny then and I wanted to try it out. But I’ve repeatedly wanted to refer to it and then when I looked, wanted to fix it up a little, so I’ve migrated it back to its natural home on the blog.]

This causes two problems: First, people worry that they’re being unreasonable or paranoid or something (they’re not). Second, we lack the right rhetoric (in the formal sense; language aimed at convincing others) for the occasions when we find ourselves talking to the unworried, or to law-enforcement officials, or to the public servants minding the legal framework that empowers the watchers.

The reason I’m writing this is to shoot holes in the “If you haven’t done anything wrong, don’t worry” story. Because it’s deeply broken and we need to refute it efficiently if we’re going to make any progress.

Privacy is a gift of civilization

Living in a civilized country means you don’t have to poop in a ditch, you don’t have to fetch water from the well or firewood from the forest, and you don’t have to share details of your personal life. It is a huge gift of civilization that behind your front door you need not care what people think about how you dress, how you sleep, or how you cook. And that when communicating with friends and colleagues and loved ones, you need not care what anyone thinks unless you’ve invited them to the conversation.

Photo credit: Beyond My Ken, via Wikimedia Commons

Privacy doesn’t need any more justification. It’s a quality-of-life thing and needs no further defense. We and generations of ancestors have worked hard to build a civilized society and one of the rewards is that often, we can relax and just be our private selves. So we should resist anyone who wants to take that away.

Bad people

The public servants and private surveillance-capitalists who are doing the watching are, at the end of the day, people. Mostly honorable and honest; but some proportion will always be crooked or insane or just bad people; no higher than in the general population, but never zero. I don’t think Canada, where I live, is worse than anywhere else, but we see a pretty steady flow of police brutality and corruption stories. And advertising is not a profession built around integrity. These are facts of life.

Given this, it’s unreasonable to give people the ability to spy on us without factoring in checks and balances to keep the rogues among them from wreaking havoc.

“But this stuff isn’t controversial”

You might think that your communications are definitely not suspicious or sketchy, and in fact boring, and so why should you want privacy or take any effort to have it?

Because you’re forgetting about the people who do need privacy. If only the “suspicious” stuff is made private, then our adversaries will assume that anything that’s private must be suspicious. That endangers our basic civilizational privacy privilege and isn’t a place we want to be.

Talking points for everyday use

First, it’s OK to say “I don’t want to be watched”; no justification is necessary. Second, as a matter of civic hygiene, we need to be regulating our watchers, watching out for individual rogues and corrupt cultures.

So it’s OK to demand privacy by default; to fight back against those who would commandeer the Internet; and (especially) to use politics to empower the watchers’ watchers; make their political regulators at least as frightened of the voters as of the enemy.

That’s the reasonable point of view. It’s the surveillance-culture people who want to abridge your privacy who are being unreasonable.

C2PA Progress 29 Oct 2024, 8:00 pm

I took a picture looking down a lane at sunset and liked the way it came out, so I prettied it up a bit in Lightroom to post on Mastodon. When I exported the JPG, I was suddenly in the world of C2PA, so here’s a report on progress and problems. This article is a bit on the geeky side but I think the most interesting bits concern policy issues. So if you’re interested in online truth and disinformation you might want to read on.

If you don’t know what “C2PA” is, I immodestly think my introduction is a decent place to start. Tl;dr: Verifiable provenance for online media files. If for some reason you think “That can’t possibly work”, please go read my intro.

Here’s the Lightroom photo-export dialog that got my attention:

There’s interesting stuff in that dialog. First, it’s “Early Access”, and I hope that means not fixed in stone, because there are issues (not just the obvious typo); I’ll get to them.

Where’s the data?

There’s a choice of where to put the C2PA data (if you want any): Right there in the image, in “Content Credentials Cloud” (let’s say CCC), or both. That CCC stuff is (weakly) explained here — scroll down to “How are Content Credentials stored and recovered?” I think storing the C2PA data in an online service rather than in the photo is an OK idea — doesn’t weaken the verifiability story I think, although as a blogger I might be happier if it were stored here on the blog? This whole area is work in progress.

What surprised me on that Adobe CCC page was the suggestion that you might be able to recover the C2PA data about a picture from which it had been stripped. Obviously this could be a very bad thing if you’d stripped that data for a good reason.

I’m wondering what other fields you could search on in CCC… could you find pictures if you knew what camera they were shot with, on some particular date? Lots of complicated policy issues here.

Also there’s the matter of size: The raw JPG of the picture is 346K, which balloons to 582K with the C2PA. Which doesn’t bother me in the slightest, but if I were serving millions of pictures per day it would.

Who provided the picture?

I maintain that the single most important thing about C2PA isn’t recording what camera or software was used, it’s identifying who the source of the picture is. Because, living online, your decisions on what to believe are going to rely heavily on who to believe. So what does Lightroom’s C2PA feature offer?

First, it asserts that the picture is by “Timothy Bray”; notice that that value is hardwired and I can’t change it. Second, that there’s a connected account at Instagram. In the C2PA, these assertions are signed with an Adobe-issued certificate, which is to say Adobe thinks you should believe them.

Let’s look at both. Adobe is willing to sign off on the author being “Timothy Bray”, but they know a lot more about me; my email, and that I’ve been a paying customer for years. Acknowledging my name is nice but it’d be really unsurprising if they have another Tim Bray among their millions of customers. And suppose my name was Jane Smith or some such.

It’d be well within Adobe’s powers to become an identity provider and give me a permanent ID like “https://id.adobe.com/timbray0351”, and include that in the C2PA. Which would be way more useful to establish provenance, but then Adobe Legal would want to take a very close look at what they’d be getting themselves into.

But maybe that’s OK, because it’s offering to include my “Connected” Instagram account, https://www.instagram.com/twbray. By “connected” they mean that Lightroom went through an OAuth dance with Meta and I had to authorize either giving Insta access to Adobe or Adobe to Insta, I forget which. Anyhow, that OAuth stuff works. Adobe really truly knows that I control that Insta ID and they can cheerfully sign off on that fact.

They also offered me the choice of Behance, Xitter, and LinkedIn.

I’ll be honest: This excites me. If I really want to establish confidence that this picture is from me, I can’t think of a better way than a verifiable link to a bunch of my online presences, saying “this is from that guy you also know as…” Obviously, I want them to add my blog and Mastodon and Bluesky and Google and Apple and my employer and my alma mater and my bank, and then let me choose, per picture, which (if any) of those I want to include in the C2PA. This is very powerful stuff on the provenance front.

Note that the C2PA doesn’t include anything about what kind of device I took the picture on (a Pixel), nor when I took it, but that’d be reasonably straightforward for Google’s camera app to include. I don’t think that information is as important as provenance but I can imagine applications where it’d be interesting.

What did they do to the picture?

The final choice in that export dialog is whether I want to disclose what I did in Lightroom: “Edits and Activity”. Once again, that’s not as interesting as the provenance, but it might be if we wanted to flag AI intervention. And there are already problems in how that data is used; more below.

Anyhow, here’s the picture; I don’t know if it pleases your eye but it does mine.

View down an urban lane towards the setting sun; includes C2PA data

Now, that image just above has been through the ongoing publishing system, which doesn’t know about C2PA, but if you click and enlarge it, the version you get is straight outta Lightroom and retains the C2PA data.

If you want to be sure, install c2patool, and apply it to lane.jpg. Too lazy? No problem, because here’s the JSON output (with the --detailed option). If you’re geeky at all and care about this stuff, you might want to poke around in there.

Another thing you might want to do is download lane.jpg and feed it to the Adobe Content Authenticity Inspect page. Here’s what you get:

Output from the Adobe Content Authenticity “Inspector” service

This is obviously a service that’s early in its life and undoubtedly will get more polish. But still, interesting and useful.

Not perfect

In case it’s not obvious, I’m pretty bullish on C2PA and think it provides us useful weapons against online disinformation and to support trust frameworks. So, yay Adobe, congrats on an excellent start! But, things bother me:

[Update: There used to be a complaint about c2patool here, but its author got in touch with me and pointed out that when you run it and it doesn’t complain about validation problems, that means there weren’t any. Very UNIX. Oops.]
Adobe’s Inspector is also available as a Chrome extension. I’m assuming they’ll support more browsers going forward. Assuming a browser extension is actually useful, which isn’t obvious.
The Inspector’s description of what I did in Lightroom doesn’t correspond very well to what the C2PA data says. What I actually did, per the C2PA, was (look for “actions” in the JSON):
1. Opened an existing file named “PXL_20241013_020608588.jpg”.
2. Reduced the exposure by -15.
3. Generated a (non-AI) mask, a linear gradient from the top of the picture down.
4. In the mask, moved the “Shadows” slider to -16.
5. Cropped and straightened the picture (the C2PA doesn’t say how much).
6. Changed the masking again; not sure why this is here because I didn’t do any more editing.
The Inspector output tries to express all this in vague nontechnical English, which loses a lot of information and in one case is just wrong: “Drawing edits: Used tools like pencils, brushes, erasers, or shape, path, or pen tools”. I think that in 2024, anyone who cares enough to look at this stuff knows about cropping and exposure adjustments and so on; they’re ubiquitous everywhere photos are shared.
If I generate C2PA data in an Adobe product, and if I’ve used any of their AI-based tools that either create or remove content, that absolutely should be recorded in the C2PA. Not as an optional extra.
I really, really want Adobe to build a flexible identity framework so you can link to identities via DNS records or .well-known files or OpenID Connect flows, so that I get to pick which identities are included with the C2PA. This, I think, would be huge.
This is not an Adobe problem, but it bothers me that I can’t upload this photo to any of my social-media accounts without losing the C2PA data. It would be a massive win if all the social-media platforms, when you uploaded a photo with C2PA data, preserved it and added more, saying who initially uploaded it. If you know anyone who writes social-media software, please tell them.

Once again, this is progress! Life online with media provenance will be better than the before times.

LLMM 28 Oct 2024, 8:00 pm

The ads are everywhere; on bus shelters and in big-money live-sportscasts and Web interstitials. They say Apple’s products are great because Apple Intelligence and Google’s too because Google Gemini. I think what’s going on here is pretty obvious and a little sad. AI and GG are LLMM: Large Language Mobile Marketing!

It looks like this:

Here are nice factual Wikipedia rundowns on Apple Intelligence and Google Gemini.

The object of the game is to sell devices, and the premise seems to be that people will want to buy them because they’re excited about what AI and GG will do for them. When they arrive, that is, which I guess they’re just now starting to. I guess I’m a little more LLM-skeptical than your average geek, but I read the list of features and thought: Would this sort of thing accelerate my mobile-device-upgrade latency, which at the moment is around three years? Um, no. Anyone’s? Still dubious.

Quite possibly I’m wrong. Maybe there’ll be a wave of influencers raving about how AI/GG improved their sex lives, income, and Buddha-nature, the masses will say “gotta get me some of that” and quarterly sales will soar past everyone’s stretch goals.

What I think happened

I think that the LLMania among the investor/executive class led to a situation where massive engineering muscle was thrown at anything with genAI in its pitch, and where, when it came time to ship, that class demanded that genAI be the white-hot center of the launch marketing.

Because just at the moment, a whole lot of nontechnical people with decision-making power have decided that it’s lethally risky not to bet the farm on a technology they don’t understand. It’s not like it’s the first time it’s happened.

Why it’s sad

First, because the time has long gone when a new mobile-device feature changed everyone’s life. Everything about them is incrementally better every year. When yours wears out, there’ll be a bit of new-shiny feel about onboarding to your new one. But seriously, what proportion of people buy a new phone for any reason other than “the old one wore out”?

This is sad personally for me because I was privileged to be there, an infinitesimally small contributor during the first years of the mobile wave, when many new features felt miraculous. It was a fine time but it’s gone.

The other reason it’s sad is the remorseless logic of financialized capitalism; the revenue number must go up even when the audience isn’t, and major low-hanging unmet needs are increasingly hard to find.

So, the machine creates a new unmet need (for AI/GG) and plasters it on bus shelters and my TV screen. I wish they wouldn’t.

Cursiveness 18 Oct 2024, 9:00 pm

I’ve found relief from current personal stress in an unexpected place: what my mother calls “penmanship”, i.e. cursive writing that is pleasing to the eye and clearly legible. (Wikipedia’s definition of “penmanship” differs, interestingly. Later.) Herewith notes from the handwriting front.

[Oh, that stress: We’re in the final stages of moving into a newly-bought house from the one we bought 27 years ago, and then selling the former place. This is neither easy nor fun. Might be a blog piece in it but first I have to recover.]

My generation

I’m not sure which decade handwriting ceased to matter to schoolchildren; my own kids got a derisory half-term unit. I have unpleasant elementary-school memories of my handwriting being justly criticized, month after month. And then, after decades of pounding a keyboard, it had devolved to the point where I often couldn’t read it myself.

Which I never perceived as much of a problem. I’m a damn fast and accurate typist and for anything that matters, my communication failures aren’t gonna involve letterforms.

I’ve been a little sad that I had become partly illiterate, absent a keyboard and powerful general-purpose computer. But it wasn’t actually a problem. And my inability to decipher my own scribbling occasionally embarrassed me, often while standing in a supermarket aisle. (If your family is as busy as mine, a paper notepad in a central location is an effective and efficient way to build a shopping list.)

Then one night

I was in bed but not asleep and my brain meandered into thoughts of handwriting; then I discovered that the penmanship lessons from elementary school seemed still to be lurking at the back of my brain. So I started mentally handwriting random texts on imaginary paper, seeing if I could recall all those odd cursive linkages. It seemed I could… then I woke up and it was morning. This has continued to work, now for several weeks.

So that’s a quality-of-life win for me: Mental penmanship as a surprisingly strong soporific. Your mileage may vary.

What, you might ask, is the text that I virtually handwrite? Famous poems? Zen koans? The answer is weirder: I turn some switch in a corner of my brain and words that read sort of like newspaper paragraphs come spilling out, making sense but not really meaning anything.

Makes me wonder if I have an LLM in my mind.

Dots and crosses

After the occasional bedtime resort to mental cursive, I decided to try the real thing, grabbed the nearest pen-driven tablet, woke up an app that supports pen input, and started a freehand note. I found, pleasingly, that if I held the childhood lessons consciously in focus, I could inscribe an adequately comprehensible hand.

(Not the first attempt.)

Dotting and crossing

There’s a message in the media just above. I discovered that one reason my writing was so terrible was lacking enough patience to revisit the i’s and t’s after finishing a word that contains them, but rather trying to dot and cross as I went along. Enforcing a steely “finish the word, then go back” discipline on myself seems the single most important factor in getting a coherent writing line.

I’ve made the point this blog piece wants to make, but learned a few things about the subject along the way.

Wikipedia?

It says penmanship means simply the practice of inscribing text by hand (cursive is the subclass of penmanship where “characters are written joined in a flowing manner”). But I and the OED both think that English word also commonly refers to the quality of writing. So I think that entry needs work.

Oh, and “Penmanship” also stands for Tommaso Ciampa the professional wrestler; earlier in his career he fought as “Tommy Penmanship”. I confess I offer this tasty fact just so I could include his picture.

Pop culture?

As I inscribed to-buys on the family grocery list, going back to dot and cross, it occurred to me that “or” was difficult; the writing line leaves the small “o” at the top of the letter, but a small “r” wants to begin on the baseline. I addressed this conundrum, as one does, by visiting YouTube. And thus discovered that a whole lot of people care about this stuff; there are, of course, /r/Cursive and /r/Handwriting.

Which sort of makes sense in a time when LPs and film photography are resurging. I think there are deep things to be thought and (not necessarily hand-)written about the nature of a message inscribed in cursive, even when that cursive is described in pixels. But I’m not going there today. I’m just saying I can read my grocery lists now.

Trollope’s aristos

I distinctly recall reading, in one of Anthony Trollope’s excellent novels about mid-19th-century life, that it was common knowledge that the landed aristocracy heedlessly wrote in incomprehensible chicken-scratches, but that the clerks and scriveners and merchants, the folk lacking genealogy, were expected to have a clear hand.

The new hotness?

I dunno, I don’t really think cursive is, but the idea isn’t crazy.

Voting Green October 19th 15 Oct 2024, 9:00 pm

I’m old enough that I remember voting in the Seventies. I never miss a chance to vote so that’s a lot of elections. In all but one or two my vote has gone to the NDP, Canada’s social democrats. There’s a provincial election Saturday, and I’ll be voting Green, against the current NDP government.

It’s not complicated: I’ve become a nearly-single-issue voter. The fangs of the climate monster are closing on us, and drastic immediate action is called for by all responsible governments to stave them off.

The BC NDP has followed its unlamented right-wing predecessor in making a huge energy bet on fossil fuels, “natural” gas in particular, and especially LNG, optimized for export. “Natural” gas, remember, is basically methane. The fossil-fuels mafia has greenwashed it for years as a “better alternative”, and a “bridge to the renewable future”. Which is a big fat lie; it’s been known for years to be a potent greenhouse gas, and recent research suggests it’s more damaging than coal.

Tilbury

That was the LNG project that made me snap. Here is coverage that tries to be neutral. Tilbury was sold as being a good thing because LNG is said to have a lighter carbon load than the heavy bunker fuel freighters usually burn. Supposing that to be true, well so what: The terminal mostly exists to pump locally-extracted methane to the rest of the world. Check out Tilbury’s first contract, for 53,000 tons of LNG a year off to China, with no indication of what it will be used for and plenty of reason to believe it will end up heating buildings, which instead should be moving to renewable options.

Tilbury is just the latest chapter of the successful march of LNG infrastructure through the minds of successive BC governments; I’ll spare you the long, dispiriting story (but I won’t forget it in the polling booth).

I don’t believe it’s oversimplifying to say that essentially everything the fossil-fuel industry says is a pack of self-serving planet-destroying lies. Why would I vote for a party that apparently believes those lies?

The Carbon Tax

Post-Tilbury, I was probably 60% of the way to splitting with the NDP when they announced they were ready to drop the carbon tax. It is hard to find an economist who does not believe that a carbon tax is one of our sanest and most powerful anti-GHG policy tools. BC has been a North-American leader in instituting and maintaining a carbon tax. So, that sealed the deal. Bye bye, NDP.

What’s happening is simple enough: Canada’s right-wing troglodytes have united around an anti-Carbon-tax platform, chanting “axe the tax”. And our NDP has waved the chickenshit-colored flag of surrender. You can pander to reactionary hypocrites, or you can help us do our bit for the world my children will inherit, but you can’t do both. Bye.

The Greens

Their platform is the kind of sensible social-democratic stuff that I’ve always liked, plus environmentalist to the core. Leader Sonia Furstenau is impressive. It wasn’t a hard choice.

But tactical voting!

It’s been a weird election, with the official opposition party, center-rightists who long formed the government, changing their name (from “Liberals” to “United”) midstream, then collapsing. This led to the emergence of the BC Conservative Party, a long-derided fringe organization famous for laughable candidates, thick with anti-vaxxers, climate deniers, anti-wokesters, anti-LGBTQ ranters, and multiple other flavors of conspiracy connoisseur.

Guess what: That’s what they still are! But much to everyone’s surprise, they’re running pretty close to neck and neck with the NDP.

So people like me can expect to be told that by abandoning the NDP, we’re in effect aiding and abetting the barbarians at the gate. (To be fair, nobody has actually said that to me. The strongest I’ve heard is “it’s your privilege to waste your vote.”)

But what I see is two parties neither of which has any concern for my children’s future, and one which does. If it’s wrong to vote on that basis, I don’t want to be right.

Unbackslash 22 Sep 2024, 9:00 pm

Old software joke: “After the apocalypse, all that’ll be left will be cockroaches, Keith Richards, and markup characters that have been escaped (or unescaped) one too many (or few) times.” I’m working on a programming problem where escaping is a major pain in the ass, specifically “\”. So, for reasons that seem good to me, I want to replace it. What with?

The problem

My Quamina project is all about matching patterns (not going into any further details here, I’ve written this one to death). Recently, I implemented a “wildcard” pattern, that works just like a shell glob, so you can match things like *.xlsx or invoice-*.pdf. The only metacharacter is *, so it has basic escaping, just \* and \\.
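To make those semantics concrete, here is a minimal sketch in Go of a matcher for that kind of pattern: * matches any run of characters, and \* and \\ stand for a literal star and a literal backslash. This is an illustration only, not Quamina's code; the names parseWildcard and matchWildcard are made up for the example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parseWildcard (hypothetical helper) splits a pattern on unescaped '*'
// and resolves the two escapes, \* and \\, into literal characters.
func parseWildcard(pattern string) ([]string, error) {
	var parts []string
	var cur []byte
	for i := 0; i < len(pattern); i++ {
		switch c := pattern[i]; c {
		case '\\':
			i++
			if i == len(pattern) || (pattern[i] != '*' && pattern[i] != '\\') {
				return nil, errors.New(`\ must be followed by * or \`)
			}
			cur = append(cur, pattern[i])
		case '*':
			parts = append(parts, string(cur))
			cur = cur[:0]
		default:
			cur = append(cur, c)
		}
	}
	return append(parts, string(cur)), nil
}

// matchWildcard reports whether s matches the glob-style pattern.
func matchWildcard(pattern, s string) (bool, error) {
	parts, err := parseWildcard(pattern)
	if err != nil {
		return false, err
	}
	if len(parts) == 1 { // no '*' at all: must match exactly
		return s == parts[0], nil
	}
	first, last := parts[0], parts[len(parts)-1]
	if len(s) < len(first)+len(last) ||
		!strings.HasPrefix(s, first) || !strings.HasSuffix(s, last) {
		return false, nil
	}
	// Middle pieces must appear in order between the prefix and suffix;
	// taking the leftmost occurrence each time is enough.
	rest := s[len(first) : len(s)-len(last)]
	for _, mid := range parts[1 : len(parts)-1] {
		i := strings.Index(rest, mid)
		if i < 0 {
			return false, nil
		}
		rest = rest[i+len(mid):]
	}
	return true, nil
}

func main() {
	for _, tc := range [][2]string{
		{"*.xlsx", "report.xlsx"},
		{"invoice-*.pdf", "invoice-42.pdf"},
		{`a\*b`, "a*b"},
	} {
		ok, err := matchWildcard(tc[0], tc[1])
		fmt.Println(tc[0], tc[1], ok, err)
	}
}
```

Note that the patterns in main use backtick strings, which is the only reason they are readable.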

It wasn’t hard to write the code, but the unit tests were a freaking nightmare, because \. Specifically, because Quamina’s patterns are wrapped in JSON, which also uses \ for escaping, and I’m coding in Go, which does too, differently for strings delimited by " and `. In the worst case, to test whether \\ was handled properly, I’d have \\\\\\\\ in my test code.
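Here is a small sketch of how the layers stack up. It assumes a made-up JSON wrapper with a single "wildcard" field (the field name and the decodePattern helper are just for illustration) and writes the same document twice, once in a backtick string and once in a double-quoted one.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodePattern (hypothetical helper) strips the JSON layer and returns
// the string that the pattern-matching code would actually receive.
func decodePattern(doc string) (string, error) {
	var p struct {
		Wildcard string `json:"wildcard"`
	}
	if err := json.Unmarshal([]byte(doc), &p); err != nil {
		return "", err
	}
	return p.Wildcard, nil
}

func main() {
	// Backtick string: Go leaves backslashes alone, so only JSON's
	// escaping applies. Four backslashes in the source.
	raw := `{"wildcard": "\\\\"}`
	// Double-quoted string: Go's escaping stacks on top of JSON's.
	// Eight backslashes in the source for the same bytes.
	quoted := "{\"wildcard\": \"\\\\\\\\\"}"

	fmt.Println("same bytes:", raw == quoted)

	got, err := decodePattern(quoted)
	if err != nil {
		panic(err)
	}
	// What reaches the matcher is two backslashes, which the wildcard
	// syntax in turn reads as one literal backslash.
	fmt.Println("matcher sees", len(got), "characters:", got)
}
```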

It got to the point that when a test was failing, I had to go into the debugger to figure out what eventually got passed to the library code I was working on. One of the cats jumped up on my keyboard while I was beset with \\\\ and found itself trying to tread air. (It was a short drop onto a soft carpet. But did I ever get glared at.)

Regular expressions ouch

That’s the Quamina feature I’ve just started working on. And as everyone knows, they use \ promiscuously. Dear Reader, I’m going to spare you the “Sickening Regexps I Have Known” war stories. I’m sure you have your own. And I bet they include lots of \’s.

(The particular dialect of regexps I’m writing is I-Regexp.)

I’ve never implemented a regular-expression processor myself, so I expect to find it a bit challenging. And I expect to have really a lot of unit tests. And the prospect of wrangling the \’s in those tests is making me nauseous.

I was telling myself to suck it up when a little voice in the back of my head piped up “But the people who use this library will be writing Go code to generate and test patterns that are JSON-wrapped, so they’re going to suffer just like you are now.”

Crazy idea

So I tried to adopt the worldview of a weary developer trying to unit-test their patterns and simultaneously fighting JSON and Go about what \\ might mean. And I thought “What if I used some other character for escaping in the regexp? One that didn’t have special meanings to multiple layers of software?”

“But that’s crazy,” said the other half of my brain. “Everyone has been writing things like \S+\.txt and [^{}[\]]+ for years and just thinks that way. Also, the Spanish Inquisition.”

Whatever; like Prince said, let’s go crazy.

The new backslash

We need something that’s visually distinctive, relatively unlikely to appear in common regular expressions, and not too hard for a programmer to enter. Here are some candidates, in no particular order.

For each, we’ll take a simple harmless regexp that matches a pair of parentheses containing no line breaks, like so:

Original: \([^\n\r)]*\)

And replace its \’s with the candidate to see what it looks like:

Left guillemet: «

This is commonly used as open-quotation in non-English languages, in particular French. “Open quotation” has a good semantic feel; after all, \ sort of “quotes” the following character. It’s visually pretty distinctive. But it’s hard to type on keyboards not located in Europe. Speaking of developers sitting behind those keyboards, they’re more likely to want to use « in a regexp. Hmm.

Sample: «([^«n«r)]*«)

Em dash: —

Speaking of characters used to begin quotes, Em dash seems visually identical to U+2015 QUOTATION DASH, which I’ve often seen as a quotation start in English-language fiction. Em dash is reasonably easy to type, unlikely to appear much in real life. Visually compelling.

Sample: —([^—n—r)]*—)

Left double quotation mark: “

(AKA left smart quote.) So if we like something that suggests an opening quote, why not just use an opening quote? There’s a key combo to generate it on most people’s keyboards. It’s not that likely to appear in developers’ regular expressions. Visually strong enough?

Sample: “([^“n“r)]*“)

Pilcrow: ¶

Usually used to mark a paragraph, so no semantic linkage. But, it’s visually strong (maybe too strong?) and has combos on many keyboards. Unlikely to appear in a regular expression.

Sample: ¶([^¶n¶r)]*¶)

Section sign: §

Once again, visually (maybe too) strong, accessible from many keyboards, not commonly found in regexps.

Sample: §([^§n§r)]*§)

Tilde: ~

Why not? I’ve never seen one in a regexp.

Sample: ~([^~n~r)]*~)

Escaping

Suppose we used tilde to replace backslash. We’d need a way to escape tilde when we wanted it to mean itself. I think just doubling the magic character works fine. So suppose you wanted to match anything beginning with . in my home directory: ~~timbray/~..*

“But wait,” you cry, “why are any of these better than \?” Because there aren’t other layers of software fighting to interpret them as an escape, it’s all yours.

You can vote!

I’m going to run a series of polls on Mastodon. Get yourself an account anywhere in the Fediverse and follow the #unbackslash hashtag. Polls will occur on Friday September 27, in reasonable Pacific times. Of course, one of the options will be “Don’t do this crazy thing, stick with good ol’ \!”

New Amplification 9 Sep 2024, 9:00 pm

The less interesting part of the story is that my big home stereo has new amplification: Tiny Class-D Monoblocks! (Terminology explained below.) More interesting, another audiophile tenet has been holed below the waterline by Moore’s Law. This is a good thing, both for people who just want good sound to be cheaper, and for deranged audiophiles like me.

Tl;dr

This was going to be a short piece, but it got out of control. So, here’s the deal: Audiophiles who love good sound and are willing to throw money at the problem should now throw almost all of it at the pure-analog pieces:

Speakers.
Listening room setup.
Phono cartridge (and maybe turntable) (if you do LPs).

What’s new and different is that amplification technology has joined D-to-A conversion as a domain where small, cheap, semiconductors offer performance that’s close enough to perfect to not matter. The rest of this piece is an overly-long discussion of what amplification is and of the new technology.

The future of amplifiers looks something like this; more below.

What’s an “amp”?

A stereo system can have lots of pieces: Record players, cartridges, DACs, volume and tone controls, input selectors, and speakers. But in every system the last step before the speakers is the “power amplifier”; let’s just say “amp”. Upstream, music is routed round the system, not with electrical currents, but by a voltage signal at what we call “line level”. That is to say, the voltage vibrates back and forth, usually between +/-1V, the vibration pattern being that of the music, i.e. that of the sound-wave vibrations you want the speakers to produce in the air between them and your ears.

Now, it takes a lot more than +/-1V to make sound come out of speakers. You need actual electrical current and multiple watts of energy to vibrate the electromagnets in your speakers and generate sound by pushing air around, which will push your eardrums around, which sends data to your brain that results in the experience of pleasure. If you have a big room and not-terribly-efficient speakers and are trying to play a Mahler symphony really loud, it can get into hundreds of watts.

So what an amp does is take the line-level voltage signal and turn it into a corresponding electric-current signal with enough oomph behind it to emulate the hundred or so musicians required for that Mahler.

Some speakers (subwoofers, sound bars) come with amps built in, so you just have to send them the line-level signal and they take care of the rest. But in a serious audiophile system, your speakers are typically passive unpowered devices driven through speaker wires by an amp.

Historically, high-end amps have often been large, heavy, expensive, impressive-looking devices. The power can come either from vacuum tubes or “solid-state” circuits (basically, transistors and capacitors). Vacuum tubes are old technology and prone to distortion when driven too hard; electric-guitar amps do this deliberately to produce cool snarly sounds. But there are audiophiles who love tube amps and plenty are sold.

Amps come in pairs, one for each speaker, usually together in a box called a “stereo amplifier”. Sometimes the box also has volume and tone controls and so on, in which case it’s called an “integrated amplifier”.

So, what’s new?

TPA3255

This thing, made by Texas Instruments, is described as a “315-W stereo, 600-W mono, 18 to 53.5V supply, analog input Class-D audio amplifier”. It’s tiny: 14x6.1mm! It sort of blows my mind that this little sliver of semiconductor can serve as the engine for the class of amps that used to weigh 20kg and be the size of a small suitcase. Also that it can deliver hundreds of watts of power without vanishing in a puff of smoke.

Also, it costs less than $20, quantity one.

It’s not that new; it was released in 2016. It would be wrong to have expected products built around it to arrive right away. I said above that the chip is the engine of an amplifier, and just like a car, once you have an engine there’s still lots to be built. You have to route the signal and power to the chip — and this particular chip needs a lot of power. You have to route the chip output to the speaker connection, and you have to deal with the fact that speakers’ impedances (impedance is resistance, except for alternating rather than direct current) vary with audio frequency in complicated ways.

Anyhow, to make a long story short, in the last couple of years there have started to be TPA3255-based amps that are aimed at audiophiles, claiming a combination of high power, high accuracy, small size, and low price. And technically-sophisticated reviewers have started to do serious measurements on them and… wow. The results seem to show that the power is as advertised, and that any distortion or nonlinearity is way down below the sensitivity of human hearing. Which is to say, more or less perfect.

For example, check out the work of Archimago, an extremely technical high-end audio blogger, who’s been digging in deep on TPA3255-based amps. If you want to look at a lot of graphs most of which will be incomprehensible unless you’ve got a university education in the subject, check out his reviews of the AIYIMA A08 Pro, Fosi Audio TB10D, and Aoshida A7.

Or, actually, don’t. Below I’ll link to the measurements of the one I bought, and discuss why it’s especially interesting. (Well, maybe do have a quick look, because some of these little beasties come with a charming steampunk aesthetic.)

PWM

That stands for pulse-width modulation, the technique that makes Class-D amps work. It’s remarkably clever. You have the line-level audio input, and you also bring in a triangle-wave signal (straight lines up then back down) at a higher frequency, and you take samples at another, still higher, frequency; if the audio voltage is higher than the triangle-wave voltage, you turn the power on, and if lower, you turn it off. So the effect is that the higher the audio voltage, the higher the proportion of time the power is on. So you get current output that is shaped like the voltage input, only with lots of little square corners that look like high-frequency noise; an appropriate circuit filters out the high frequencies and reproduces the shape of the input wave with high accuracy.

If that didn’t make sense, here’s a decent YouTube explainer.

The explanation, which my understanding of practical electronics doesn’t go deep enough to validate, is that because the power is only ever on or off, no intermediate states are necessary and the circuit is super efficient therefore cheap.

Monoblocks

Most amps are “stereo amplifiers”, i.e. two amps in a box. They have to solve the problem of keeping the two stereo signals from affecting each other. It turns out the TPA3255 does this right on the chip. So the people who measure and evaluate these devices pay a lot of attention to “channel separation” and “crosstalk”. This has led to high-end audiophiles liking “monoblock” amps, where you have two separate boxes, one for each speaker. Poof! crosstalk is no longer an issue.

Enter Fosi

You may have noticed that you didn’t recognize any of the brand names in that list of reviews above. I didn’t either. This is because mainstream brands from North America, Europe, and Japan are not exactly eager to start replacing their big impressive high-end amps costing thousands of dollars with small, cheap TPA3255-based products at a tenth the price.

Shenzhen ain’t got time for that. Near as I can tell, all these outfits shipping little cheap amps are down some back street off a back street in the Shenzhen-Guangzhou megalopolis. One of them is Fosi Audio.

They have a decent web site but are definitely a back-street Shenzhen operation. What caught my attention was Archimago’s 2-part review (1, 2) of Fosi’s V3 Mono.

This is a monoblock power amp with some ludicrously high power rating that you can buy as a pair with a shared power supply for a super-reasonable price. They launched with a Kickstarter.

I recommend reading either or both of Archimago’s reviews to feel the flavor of the quantitative-audio approach and also for the general coolness of these products.

I’m stealing one of Archimago’s pictures here, to reveal how insanely small the chip is; it’s the little black/grey rectangle at the middle of the board.

And here is my own pair of V3 Monos to the right of the record player.

Fosi V3 Mono amplifiers beside a Rega turntable

My own experience

My previous amp (an Ayre Acoustics V-5xe) was just fine albeit kinda ugly, but we’re moving to a new place and it’s just not gonna fit into the setup there. I was wrestling with this problem when Archimago published those Fosi write-ups and I was sold, so there they are.

They’re actually a little bit difficult to set up because they’re so small and the power supply is heavier than both amps put together. So I had a little trouble getting all the wires plugged in and arranged. As Archimago suggests, I used the balanced rather than RCA connectors.

Having said all that, once they were set up, they vanished, as in, if it weren’t for the space between the speakers where the old amp used to be, I wouldn’t know the difference. They didn’t cost much. They fit in. They sound good.

One concern: These little suckers get hot when I pump music through them for an extended time. I think I’m going to want to arrange them side-by-side rather than stacked, just to reduce the chances of them cooking themselves.

Also, a mild disappointment: They have an AUX setting where they turn themselves on when music starts and off again after a few minutes of silence. Works great. But, per Archimago’s measurements, they’re drawing 10 watts in that mode, which seems like way too much to me, and they remain warm to the touch. So, nice feature, but I guess I’ll have to flick their switches from OFF to ON like a savage when I want to listen to music.

The lesson

Maybe you love really good sound. Most of you don’t know because you’ve probably never heard it. I’m totally OK with Sonos or car-audio levels of quality when it’s background music for cooking or cleaning or driving. But sitting down facing a quality high-end system is really a different sort of thing. Not for everyone, but for some people, strongly habit-forming.

If it turns out that you’re one of those people, it’s now smart to invest all your money in your speakers, and in fiddling with the room where they are to get the best sound out of them. For amplification and the digital parts of the chain, buy cheap close-enough-to-perfect semiconductor products.

And of course, listen to good music. Which, to be fair, is not always that well-produced or well-recorded. But at least the limiting factor won’t be what’s in the room with you.

Standing on High Ground 8 Sep 2024, 9:00 pm

That’s the title of a book coming out October 29th that has my name on the cover. The subtitle is “Civil Disobedience on Burnaby Mountain”. It’s an anthology; I’m both an author and co-editor. The other authors are people who, like me, were arrested resisting the awful “TMX” Trans Mountain pipeline project.

Pulling together a book with 25 contributing authors is a lot of work! One of the contributions started out as a 45-minute phone conversation, transcribed by me. The others manifested in a remarkable melange of styles, structures, and formats.

Which is what makes it fun. Five of our authors are Indigenous people. Another is Elizabeth May, leader of Canada’s Green party. There is a sprinkling of university professors and faith leaders. There are two young Tyrannosauri Rex (no, really). And then there’s me, the Internet geek.

As I wrote then, my brush with the law was very soft; arrested on the very first day of a protest sequence, I got off with a fine. Since fines weren’t stopping the protest, eventually the arrestees started getting jail time. Some of the best writing in the book is the prison narratives, all from people previously unacquainted with the pointy end of our justice system.

Quoting from my own contribution:

Let me break the fourth wall here and speak as a co-editor of the book you are now reading. As I work on the jail-time narratives from other arrestees, alternately graceful, funny, and terrifying, I am consumed with rage at the judicial system. It is apparently content to allow itself to be used as a hammer to beat down resistance to stupid and toxic rent-seeking behaviour, oblivious to issues of the greater good. At no point has anyone in the judiciary looked in the mirror as they jailed yet another group of self-sacrificing people trying to throw themselves between TMX’s engine of destruction and the earth that sustains us, and asked themselves, Are we on the right side here?

Of necessity, the law is constructed of formalisms. But life is constructed on a basis of the oceans and the atmosphere and the mesh of interdependent ecosystems they sustain. At some point, the formalisms need to find the flexibility to favour life, not death. It seems little to ask.

We asked each contributor for a brief bio, a narrative of their experience, and the statement they made to the judge at the time of their sentencing. Our contributors being what they are, sometimes we instead got poems and music-theory disquisitions and discourse on Allodial title. Cartoons too!

Which, once again, is what makes it fun. Well, when it’s not rage-inducing. After all, we lost; they built the pipeline and it’s now doing its bit to worsen the onrushing climate catastrophe, meanwhile endangering Vancouver’s civic waters and shipping economy.

We got endorsements! Lots more on the Web site and book cover.

The effort was worthwhile, though. There is reason to hope that our work helped raise the political and public-image cost of this kind of bone-stupid anti-survival project to the point that few or no more will ever be built.

Along with transcribing and editing, my contribution to the book included a couple of photos and three maps. Making the maps was massively fun, so I’m going to share them here just because I can. (Warning: These are large images.)

The first appears as a two-page spread, occupying all of the left page and the top third or so of the right.

Then there’s a map of Vancouver and the Lower Mainland, highlighting the locations where much of the book’s action took place.

The Vancouver region, highlighting TMX resistance locations

Finally, here’s a close-up of Burnaby Mountain, where TMX meets the sea, and where most of the arrests happened.

TMX resistance sites around Burnaby Mountain

The credits say “Maps by Tim Bray, based on data from Google Maps, OpenStreetMap, and TMX regulatory filings.”

I suspect that if you’re the kind of person who finds yourself reading this blog from time to time, you’d probably enjoy reading Standing on High Ground. The buy-this-book link is here. If you end up buying a copy — please do — the money will go in part to our publisher Between The Lines, who seem a decent lot and were extremely supportive and competent in getting this job done. The rest gets distributed equally among all the contributors. Each contributor is given the option of declining their share, which makes sense, since some of us are highly privileged and the money wouldn’t make any difference; others can really use the dough.

What’s next?

We’re going to have a launch event sometime this autumn. I’ll announce it here and everywhere else I have a presence. There will be music and food and drink; please come!

What’s really next is the next big harebrained scheme to pad oil companies’ shareholders’ pockets by building destructive infrastructure through irreplaceable wilderness, unceded Indigenous land, and along fragile waterways. Then we’ll have to go out and get arrested again and make it more trouble than it’s worth. It wouldn’t take that many people, and it’d be nice if you were one of them.

I put in years of effort to stop the pipeline. Based on existing laws, I concluded that the pipeline was illegal and presented those arguments to the National Energy Board review panel. When we got to the moment on Burnaby Mountain when the RCMP advanced to read out the injunction to us, I was still acting in the public interest. The true lawbreakers were elsewhere.

[From Elizabeth May’s contribution.]

Thanks!

Chiefly, to our contributors, generous with their words and time, tolerant of our nit-picky editing. From me personally, to my co-editors Rosemary Cornell and Adrienne Drobnies; we didn’t always agree on everything but the considerable work of getting this thing done left nobody with hard feelings. And, as the book’s dedication says, to all those who went out and got arrested to try to convince the powers that be to do the right thing.

I’m going to close with a picture which appears in the book. It shows Kwekwecnewtxw (“Kwe-kwek-new-tukh”), the Watch House built by the Tsleil-Waututh Nation to oversee the enemy’s work, that work also visible in the background. If you want to know what a Watch House is, you’ll need to read the very first contribution in the book, which begins “Jim Leyden is my adopted name—my spirit name is Stehm Mekoch Kanim, which means Blackbear Warrior.”

0 dependencies! 4 Sep 2024, 9:00 pm

Here’s a tiny little done-in-a-couple-hours project consisting of a single static Web page and a cute little badge you can slap on your GitHub project.

The Web site is at 0dependencies.dev. The badge is visible on my current open-source projects, for example check out Topfew (you have to scroll down a bit).

Zero, you say?

In recent months I keep seeing these eruptions of geek angst about the fulminating masses of dependencies squirming under the surface of just about any software anyone uses for anything. The most recent, and what precipitated this, was Mike Perham’s Kill Your Dependencies.

It’s not just that dependencies are a fertile field for CVEs (*cough* xz *cough*) and tech debt, they’re also an enemy of predictable performance.

Also, they’re unavoidable. When you take a dependency, often you’re standing on the shoulders of giants. (Unfortunately, sometimes you’re standing in the shoes of clowns.) Software is accretive and it’s a good thing that that’s OK because it’s also inevitable.

In particular, don’t write your own crypto, etc. Because in software, as in life, you’re gonna have to take some dependencies. But… how about we take less? And how about, sometimes we strive for zero?

The lower you go

… the closer you are to zero. So, suppose you’re writing library code. Consider these criteria:

It’s low-level, might be useful to a lot of apps aimed at entirely different goals.
Good performance is important. Actually, let me revise that: predictably good performance is important.
Security is important.

If you touch all three of these bases, I respectfully suggest that you try to earn this badge: ⓿ dependencies! (By the way, it’s cool that I can toss a chunk of SVG into my HTML and it Just Works. And, you can click on it.)

How to?

First, whatever programming language you’re in, try to stay within the bounds of what comes with the language. In Go, where I live these days, that means your go.sum file is empty. Good for you!
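If you want a quick self-check, something like the Go sketch below would do, using only the standard library (naturally). It counts the requirements listed in the text of a go.mod file. It is a rough line-by-line reading, not a real parser, and countRequires is a name invented for this example; the proper parser lives in golang.org/x/mod/modfile, which would of course be a dependency.

```go
package main

import (
	"fmt"
	"strings"
)

// countRequires (hypothetical helper) counts the modules listed in
// require directives in the text of a go.mod file.
func countRequires(goMod string) int {
	count := 0
	inBlock := false
	for _, line := range strings.Split(goMod, "\n") {
		// Drop trailing comments such as "// indirect".
		if i := strings.Index(line, "//"); i >= 0 {
			line = line[:i]
		}
		line = strings.TrimSpace(line)
		switch {
		case line == "":
			// blank or comment-only line
		case inBlock && line == ")":
			inBlock = false
		case inBlock:
			count++ // one module per line inside require ( ... )
		case line == "require (":
			inBlock = true
		case strings.HasPrefix(line, "require "):
			count++ // single-line form
		}
	}
	return count
}

func main() {
	// In real use you would read this with os.ReadFile("go.mod").
	sample := "module example.com/mylib\n\ngo 1.22\n"
	if n := countRequires(sample); n == 0 {
		fmt.Println("0 dependencies!")
	} else {
		fmt.Println(n, "dependencies")
	}
}
```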

Second, be aggressive. For example, Go’s JSON support is known to be kind of slow and memory-hungry. That’s OK because there are better open-source options. For Quamina, I rejected the alternatives and wrote my own JSON parser for the hot code path. Which, to be honest, is a pretty low bar: JSON’s grammar could be inscribed on a grain of rice, or you can rely on Doug Crockford’s JSON.org.

So, get your dependencies to zero and display the badge proudly. Or if you can’t, think about each of your dependencies. Does each of them add enough value, compared to you writing the code yourself? In particular, taking a dependency on a huge general-purpose library for one small simple function is an antipattern.

What are you going to do, Tim?

I’m not trying to start a movement or anything. I just made a badge, a one-page website, and a blog post.

If I were fanatically dedicated, 0dependencies.dev would be database-backed with a React front-end and multiple Kubernetes pods, to track bearers of the badge. Uh, no.

But, I’ll keep my eyes open. And if any particularly visible projects that you know about want to claim the badge, let me know and maybe I’ll start a 0dependency hall of fame.

Long Links 2 Sep 2024, 9:00 pm

It’s been a while. Between 2020 and mid-2023, I wrote pretty regular “Long Links” posts, curating links to long-form pieces that I thought were good and I had time to read all of because, unlike my readers, I was lightly employed. Well, then along came my Uncle Sam gig, then fun Open Source with Topfew and Quamina, then personal turmoil, and I’ve got really a lot of browser tabs that I thought I’d share one day. That day is today.

Which is to say that some of these are pretty old. But still worth a look I think.

True North Indexed

Let’s start with Canadian stuff; how about a poem? No, really, check out Emergency Exit, by Kayla Czaga; touched me and made me smile.

Then there’s Canada Modern, from which comes the title of this section. It’s an endless scroll of 20th-century Canadian design statements. Go take it for a spin, it’s gentle wholesome stuff.

Renaissance prof

uses this has had a pretty good run since 2009; I quote: “Uses This is a collection of nerdy interviews asking people from all walks of life what they use to get the job done.” Older readers may find that my own May 2010 appearance offers a nostalgic glow.

Anyhow, a recent entry covers “Robert W Gehl, Professor (Communication and Media Studies)”, and reading it fills me with envy at Prof. Gehl’s ability to get along on the most pristine free-software diet imaginable. I mean, I know the answer: I’m addicted to Adobe graphics software and to Apple’s Keynote. No, wait, I don’t give that many conference talks any more and when I do, I rely on a preloaded set of browser tabs that my audience can visit and follow along.

If it weren’t for that damn photo-editing software. Anyhow, major hat-tip in Prof. Gehl’s direction. Some of you should try to be more like him. I should too.

Now for some tech culture.

Consensus

The IETF does most of the work of nailing down the design of the Internet in sufficient detail that programmers can read the design docs and write code that interoperates. It’s all done without voting, by consensus. Consensus, you say? What does that mean? Mark Nottingham (Mnot for short) has the details. Consensus in Internet Standards doesn’t limit its discussion to the IETF. You probably don’t need to know this unless you’re planning to join a standards committee (in which case you really do) but I think many people would be interested in how Internet-standards morlocks work.

More Mnot

Check out his Centralization, Decentralization, and Internet Standards. The Internet’s design is radically decentralized. Contemporary late-capitalist business structures are inherently centralized. I know which I prefer. But the tension won’t go away, and Mnot goes way deep on the nature of the problem and what we might be able to do about it.

For what it’s worth, I think “The Fediverse” is a good answer to several of Mnot’s questions.

More IETF

From last year, Reflections on Ten Years Past the Snowden Revelations is a solid piece of work. Ed Snowden changed the Internet, made it safer for everyone, by giving us a picture of what adversaries did and knew. It took a lot of work. I hope Snowden gets to come home someday.

Polling Palestinians

We hear lots of stern-toned denunciations of the Middle East’s murderers — both flavors, Zionist and Palestinian — and quite a variety of voices from inside Israel. But the only Palestinians who get quoted are officials from Hamas or the PLA; neither organization has earned the privilege of your attention. So why not go out and use modern polling methodology to find out what actual Palestinians think? The project got a write-up in the New Yorker: What It Takes to Give Palestinians a Voice. And then here’s the actual poll, conducted by the “Palestinian Center for Policy and Survey Research”, of which I know nothing. Raw random data about one of the world’s hardest problems.

Music rage

Like music? Feel like a blast of pure white-hot cleansing rage? Got what you need: Same Old Song: Private Equity Is Destroying Our Music Ecosystem. I mean, stories whose titles begin “Private equity is destroying…” are getting into “There was a Tuesday in last week” territory. But this one hit me particularly hard. I mean, take the ship up and nuke the site from orbit. It’s the only way to be sure.

Movies too

Existentially threatened by late capitalism, I mean. Hollywood’s Slo-Mo Self-Sabotage has the organizational details about how the biz is eating its seed corn in the name of “efficiency”.

I’m increasingly convinced that the whole notion of streaming is irremediably broken; these articles speak to the specifics and if they’re right, we may get to try out new approaches after the streamers self-immolate.

A target for luck

I’ve mostly not been a fan of Paul Graham. Like many, I was impressed by his early essays, then saddened as he veered into a conventional right-wing flavor that was reactionary, boring, and mostly wrong. So these days, I hesitate to recommend his writing. Having said that, here’s an outtake from How To Do Great Work:

When you read biographies of people who've done great work, it's remarkable how much luck is involved. They discover what to work on as a result of a chance meeting, or by reading a book they happen to pick up. So you need to make yourself a big target for luck, and the way to do that is to be curious. Try lots of things, meet lots of people, read lots of books, ask lots of questions.

Amen. And the humility — recognition that good outcomes need more than brains and energy — is not exactly typical of the Bay-Aryan elite, and is welcome. And there’s other thought-provoking stuff in there too, but the tone will put many off; the wisdom is dispensed with an entire absence of humility, or really any supporting evidence. And that title is a little cringey. Could have been shorter, too.

“readable, writerly web layouts”

Jeffrey Zeldman asks who will design them. It’s mostly a list of links to plausible candidates for that design role. Year-old links, now, too. But still worth grazing on if you care about this stuff, which most of us probably should.

Speaking of which, consider heather buchel’s Just normal web things. I suspect that basically 100% of the people who find their way here will be muttering FUCK YEAH! at every paragraph.

Enshittification stanzas

(My oldest tabs, I think.) Ellis Hamburger’s Social media is doomed to die and Cat Valente’s Stop Talking to Each Other and Start Buying Things: Three Decades of Survival in the Desert of Social Media say many of the same things that Cory is saying, but with more personal from-the-inside flavor. And not without streaks of optimism.

Billionaires

It’s amazing how fast this word has become shorthand for the problem that an increasing number of people believe is at the center of the most important social pathologies: The absurd level of inequality that has grown tumorously under modern capitalism. American billionaires are a policy failure doesn’t really focus on the injustice, but rather does the numbers, presenting a compelling argument that a society having billionaires yields little to no benefit to that society, and precious little to the billionaires. It’s sobering, enlightening stuff.

Gotta talk about AI I guess

The “T” in GPT stands for “Transformer”. From Was Linguistic A.I. Created by Accident? comes this quote:

It’s fitting that the architecture outlined in “Attention Is All You Need” is called the transformer only because Uszkoreit liked the sound of that word. (“I never really understood the name,” Gomez told me. “It sounds cool, though.”)

Which is to say, this piece casts an interesting sidelight on the LLM origin story, starting in the spring of 2017. If you’ve put any study into the field this probably won’t teach you anything you don’t know. But I knew relatively little of this early history.

Visual falsehood

Everyone who’s taken a serious look at the intersection of AI and photography offered by the Pixel 9 has reacted intensely. The terms applied have ranged from “cool” to “terrifying”. I particularly like Sarah Jeong’s No one’s ready for this, from which a few soundbites:
“These photographs are extraordinarily convincing, and they are all extremely fucking fake.”
“…the easiest, breeziest user interface for top-tier lies…”
“…the default assumption about a photo is about to become that it’s faked…”
“A photo, in this world, stops being a supplement to fallible human recollection, but instead a mirror of it.”
We are fucked.

And that’s just the words; the pictures accompanying the article are a stomach-churning stanza of visual lies.

Fortunately, I’m not convinced we’re fucked. But Google needs to get its shit together and force this AI voodoo to leave tracks, be transparent, disclose what it’s doing. We’re starting to have the tools, in particular a thing called C2PA on which I’ve had plenty to say.

Specifically, what Google needs to do is, when someone applies an AI technique to produce an image of something that didn’t happen, write a notification that this is the case into the picture’s EXIF and include that in the C2PA-signed manifest. And help create a culture where anything that doesn’t have a verifiable C2PA-signed provenance trail should be presumed a lie and neither forwarded nor reposted nor otherwise allowed to continue on its lying path.

Fade out

Here’s some beautifully performed and recorded music that has melody and integrity and grace: The Raconteurs feat. Ricky Skaggs and Ashley Monroe - Old Enough.

I wish things were a little less hectic. Because I miss having the time for Long Links.

Let’s all do the best we can with what we have.

Q Numbers Redux Explained 31 Aug 2024, 9:00 pm

[The first guest post in some years. Welcome, Arne Hormann!]

Hi there, I'm Arne. In July, I stumbled on a lobste.rs link (thanks, Carlana) that led me to Tim’s article on Q Numbers. As I'm regularly working with both floating point numbers and Go, the post made me curious; I was sure I could improve the compression scheme. I posted comments. Tim liked my idea and challenged me to create a PR. I did the encoding but felt overwhelmed with the integration. Tim took over and merged it. And since v1.4.0 was released on August 28th, it's in Quamina.

In Tim’s post about that change, he wrote “Arne explained to me how it works in some chat that I can’t find, and to be honest I can’t quite look at this and remember the explanation”. Well, you're reading it right now.

Float64

Let's first talk about the data we are dealing with here. Quamina operates on JSON. While no JSON specification limits numbers to IEEE 754 floating point (Go calls them float64), RFC 8259 recommends treating them as such.

They look like this:

1 bit sign; 0 is positive, 1 is negative.
11 bits exponent. It uses a bias of 1023 ((1<<10)-1), and all exponent bits set means Infinity or Not a Number (NaN).
1 mantissa high bit (implicit, never stored: 0 if all exponent bits are 0, otherwise 1). This is required so there's exactly one way to store each number.
52 explicit mantissa bits. The 53 mantissa bits are the binary digits of the number itself.
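
To make that layout concrete, here is a small sketch that pulls the three stored fields out of a float64. The helper name fields is made up for this illustration and is not part of Quamina.

```go
package main

import (
	"fmt"
	"math"
)

// fields splits a float64 into the three stored IEEE 754 fields.
// (Hypothetical helper for illustration; not part of Quamina.)
func fields(f float64) (sign, exponent, mantissa uint64) {
	u := math.Float64bits(f)
	sign = u >> 63               // 1 bit
	exponent = (u >> 52) & 0x7FF // 11 bits, biased by 1023
	mantissa = u & (1<<52 - 1)   // 52 explicit bits
	return
}

func main() {
	for _, f := range []float64{1, -2, 0.5, 0.1} {
		s, e, m := fields(f)
		fmt.Printf("%5g sign=%d exponent=%4d (unbiased %5d) mantissa=%013x\n",
			f, s, e, int(e)-1023, m)
	}
}
```

For 1, -2 and 0.5 the mantissa field is all zeroes, because the only set bit is the implicit high one; 0.1 fills the field with a repeating pattern.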

Both 0 and -0 exist and are equal but are represented differently. We have to normalize -0 to 0 for comparability, but according to Tim’s tests, -0 cannot occur in Quamina because JSON decoding silently converts -0 to 0.

When all exponent bits are set, a mantissa of 0 means Infinity and any other mantissa means NaN. Both sign values are used, so there are a lot of different NaNs: 1<<53 - 2 of them!

In JSON, Infinity and NaN are not representable as numbers, so we don't have to concern ourselves with them.

Finally, keep in mind that these are binary numbers, not decimals. Decimal 0.1 cannot be accurately encoded. But 0.5 can. And all integers up to 1<<53, too.
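
A quick way to see this for yourself is the sketch below. Both helper names are invented for the example. The sums go through a function on purpose: Go evaluates constant expressions such as 0.1+0.2 exactly at compile time, which would hide the effect.

```go
package main

import "fmt"

// sumIsExact reports whether a+b, computed in float64 at run time,
// equals want. (Hypothetical helper for illustration.)
func sumIsExact(a, b, want float64) bool {
	return a+b == want
}

// survivesRoundTrip reports whether the integer n can be stored in a
// float64 and read back unchanged. (Hypothetical helper for illustration.)
func survivesRoundTrip(n int64) bool {
	return int64(float64(n)) == n
}

func main() {
	fmt.Println(sumIsExact(0.1, 0.2, 0.3))    // false: none of these is exact in binary
	fmt.Println(sumIsExact(0.5, 0.25, 0.75))  // true: all are sums of powers of two
	fmt.Println(survivesRoundTrip(1 << 53))   // true
	fmt.Println(survivesRoundTrip(1<<53 + 1)) // false: needs a 54th mantissa bit
}
```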

Adding numbers to Quamina

Quamina operates on UTF-8 strings and compares them byte by byte. To add numbers to it, they have to be bytewise-comparable and all bytes have to be valid in UTF-8.

Given these constraints, let's consider the problem of comparability, first.

Sortable bits

We can use math.Float64bits() to convert a float64 into its individual bits (stored as uint64).

Positive numbers are already perfectly sorted. But they are smaller than negative numbers. To fix that, we always flip the sign bit so it's 1 for positive and 0 for negative values.

Negative values are sorted exactly the wrong way around. To totally reverse their order, we have to flip all exponent and mantissa bits in addition to the sign bits.

A simple implementation would look like:


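// numbitsFromFloat64 maps a float64 to bits that sort in numeric order
// when compared as unsigned integers (readable, branching version).
func numbitsFromFloat64(f float64) numbits {
	u := math.Float64bits(f)
	if f < 0 {
		// negative: flip every bit, which reverses the order
		return numbits(^u)
	}
	// zero or positive: set the sign bit so it sorts above all negatives
	return numbits(u) | (1 << 63)
}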

Now let's look at the actual code:


func numbitsFromFloat64(f float64) numbits {
	u := math.Float64bits(f)
	mask := (u>>63)*^uint64(0) | (1 << 63)
	return numbits(u ^ mask)
}

The mask line can be a bit of a headscratcher...

u>>63 moves the sign bit of the number to the lowest bit
the result will be 0 for positive values and 1 for negative values
that's multiplied with ^0 - all 64 bits set to true
we now have 0 for positive numbers and all bits set for negative numbers
regardless, always set the sign bit with | (1 << 63)

By xoring mask to the original bits, we get our transformation for lexically ordered numbers.

The code is a bit hard to parse because it avoids branches for performance reasons.
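
If you'd rather check than trust, the following sketch runs both versions side by side on a handful of ascending values. The names branching, branchless and agreeAndAscend are chosen for this example only. One caveat: the two versions disagree on -0, because f < 0 is false for it while its sign bit is set. As said above, -0 shouldn't show up in Quamina.

```go
package main

import (
	"fmt"
	"math"
)

type numbits uint64

// branching is the easy-to-read version.
func branching(f float64) numbits {
	u := math.Float64bits(f)
	if f < 0 {
		return numbits(^u)
	}
	return numbits(u) | (1 << 63)
}

// branchless is the mask-based version.
func branchless(f float64) numbits {
	u := math.Float64bits(f)
	mask := (u>>63)*^uint64(0) | (1 << 63)
	return numbits(u ^ mask)
}

// agreeAndAscend reports whether both versions agree on every value and
// whether the results ascend along with the (already ascending) inputs.
func agreeAndAscend(ascending []float64) bool {
	for i, f := range ascending {
		if branching(f) != branchless(f) {
			return false
		}
		if i > 0 && branchless(ascending[i-1]) >= branchless(f) {
			return false
		}
	}
	return true
}

func main() {
	values := []float64{-1e300, -2.5, -1, -0.001, 0, 0.001, 1, 2.5, 1e300}
	fmt.Println(agreeAndAscend(values)) // true
	for _, f := range values {
		fmt.Printf("%8g -> %016x\n", f, uint64(branchless(f)))
	}
}
```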

Sortable bytes

If we were to use the uint64 now, it would work out great. But it has to be split into its individual bytes. And on little endian systems - most of the frequently used ones - the byte order will be wrong and has to be reversed. That's done by storing the bytes in big endian encoding.
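
Here's a sketch of why the byte order matters, using the standard encoding/binary package. The helper names are made up for the example. With big endian bytes, 1 sorts before 2; with little endian bytes it doesn't.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"math"
)

type numbits uint64

func numbitsFromFloat64(f float64) numbits {
	u := math.Float64bits(f)
	mask := (u>>63)*^uint64(0) | (1 << 63)
	return numbits(u ^ mask)
}

// bigEndian stores the most significant byte first, so comparing bytes
// left to right matches comparing the integers.
func bigEndian(f float64) []byte {
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, uint64(numbitsFromFloat64(f)))
	return b
}

// littleEndian stores the least significant byte first.
func littleEndian(f float64) []byte {
	b := make([]byte, 8)
	binary.LittleEndian.PutUint64(b, uint64(numbitsFromFloat64(f)))
	return b
}

func compareBigEndian(a, b float64) int {
	return bytes.Compare(bigEndian(a), bigEndian(b))
}

func compareLittleEndian(a, b float64) int {
	return bytes.Compare(littleEndian(a), littleEndian(b))
}

func main() {
	fmt.Println(compareBigEndian(1, 2))    // -1: 1 sorts before 2, as it should
	fmt.Println(compareLittleEndian(1, 2)) // 1: wrong way around
}
```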

UTF-8 bytes

Our next problem is the required UTF-8 encoding. Axel suggested using only the lower 7 bits of each byte. That means we need up to 10 bytes instead of 8 (10*7 = 70 bits is enough room for 8*8 = 64).
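
As a sketch of the idea, the function below spreads the 64 bits over ten bytes, seven bits each, most significant group first. The name toBase128 and the exact layout are assumptions for this illustration; the real code in Quamina may differ in detail. Every byte stays below 0x80, so the result is plain ASCII and therefore valid UTF-8.

```go
package main

import (
	"fmt"
	"math"
	"unicode/utf8"
)

type numbits uint64

func numbitsFromFloat64(f float64) numbits {
	u := math.Float64bits(f)
	mask := (u>>63)*^uint64(0) | (1 << 63)
	return numbits(u ^ mask)
}

// toBase128 writes n as ten 7-bit digits, most significant first.
// The first byte only ever holds the top bit (64 = 9*7 + 1).
// (Hypothetical helper; layout assumed for illustration.)
func toBase128(n numbits) []byte {
	out := make([]byte, 10)
	for i := len(out) - 1; i >= 0; i-- {
		out[i] = byte(n & 0x7f)
		n >>= 7
	}
	return out
}

func main() {
	for _, f := range []float64{-1, 0, 0.1, 1, 100} {
		b := toBase128(numbitsFromFloat64(f))
		fmt.Printf("%5g -> % x (valid UTF-8: %v)\n", f, b, utf8.Valid(b))
	}
}
```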

Compress

The final insight was that trailing 0 bytes do not influence the comparison at all and can be dropped. Which compresses positive power-of-two numbers and integers even further. Not negative values, though - the required bit inversion messes it up.
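
Putting the pieces together, here is a minimal end-to-end sketch under the same assumed layout as before (sortable bits, ten 7-bit groups, trailing zero bytes dropped). The names encode and less are invented for the example and this is not the code from numbits.go.

```go
package main

import (
	"bytes"
	"fmt"
	"math"
)

// encode turns a float64 into a bytewise-comparable, 7-bit-clean byte
// string with trailing zero bytes removed (at least one byte is kept).
// (Hypothetical helper; layout assumed for illustration.)
func encode(f float64) []byte {
	u := math.Float64bits(f)
	n := u ^ ((u>>63)*^uint64(0) | (1 << 63))
	buf := make([]byte, 10)
	for i := len(buf) - 1; i >= 0; i-- {
		buf[i] = byte(n & 0x7f)
		n >>= 7
	}
	end := len(buf)
	for end > 1 && buf[end-1] == 0 {
		end--
	}
	return buf[:end]
}

// less compares two numbers using only their encoded bytes.
func less(a, b float64) bool {
	return bytes.Compare(encode(a), encode(b)) < 0
}

func main() {
	for _, f := range []float64{0, 2, 1, 1000, 0.1, -1} {
		fmt.Printf("%5g -> %2d bytes: % x\n", f, len(encode(f)), encode(f))
	}
	fmt.Println(less(-1, 0), less(0, 1), less(1, 2), less(2, 1000))
}
```

Dropping the zeroes is safe for comparison because a shorter byte string that is a prefix of a longer one sorts first, which is exactly where the zero-padded version would have sorted.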

And that's all on this topic concerning Quamina — even if there's so, so much more about floats.

Q Numbers Redux 28 Aug 2024, 9:00 pm

Back in July I wrote about Q numbers, which make it possible to compare numeric values using a finite automaton. It represented a subset of numbers as 14-hex-digit strings. In a remarkable instance of BDD (Blog-Driven Development, obviously) Arne Hormann and Axel Wagner figured out a way to represent all 64-bit floats in at most ten bytes of UTF-8 and often fewer. This feels nearly miraculous to me; read on for heroic bit-twiddling.

Numbits

Arne Hormann worked out how to rearrange the sign, exponent and mantissa that make up a float’s 64 bits into a big-endian integer that you probably couldn’t do math with but you can compare for equality and ordering. Turn that into sixteen hex digits and you’ve got automaton fuel which covers all the floats at the cost of being a little bigger.

If you want to admire Arne’s awesome bit-twiddling skills, look at numbits.go. He explained to me how it works in some chat that I can’t find, and to be honest I can’t quite look at this and remember the explanation.

    u := math.Float64bits(f)
    // transform without branching
    // if high bit is 0, xor with sign bit 1 << 63,
    // else negate (xor with ^0)
    mask := (u>>63)*^uint64(0) | (1 << 63)
    return numbits(u ^ mask)

[Update: Arne wrote it up! See Q Numbers Redux Explained.]

Even when I was puzzled, I wasn’t worried because the unit tests are good; it works.

Arne called these “numbits” and wrote a nice complete API for them, although Quamina just needs .fromFloat64() and .toUTF8(). Arne and I both thought he’d invented this, but then he discovered that the same trick was being used in the DB2 on-disk data format years and years ago. Still, damn clever, and I’ve urged him to make numbits into a standalone library.

We want less!

We care about size; among other things, the time an automaton takes to match a value is linear (sometimes worse) in its length. So the growth from 14 to 16 bytes made us unhappy. But, no problemo! Axel Wagner pointed out that if you use base-128, you can squeeze those 64 bits into ten usable bytes of UTF-8. So now we’re shorter than the previous iteration of Q numbers while handling all the float64 values…

But wait, there’s more! Arne noticed that for purposes of equality and comparison, trailing zeroes (0x0, not ‘0’) in those 10-byte strings are entirely insignificant and can just be discarded. The final digit only has 1/128 chance of being zero, so maybe no big deal. But it turns out that you do get dramatic trailing-0 sequences in positive integers, especially small ones, which in my experience are the kinds of numbers you most often want to match. Here’s a chart of the lengths of the numbits-based Q numbers for the integers zero through 100,000 inclusive.

Length	Count
1	1
2	1
3	115
4	7590
5	92294

They’re all shorter than 5 until you get to 1,000.
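
For the curious, here is a rough sketch of how such a chart can be computed. It assumes the layout described above (sortable bits, ten 7-bit groups, trailing zero groups dropped) and is not Quamina’s actual code; encodedLength and histogram are names made up for the example.

```go
package main

import (
	"fmt"
	"math"
)

// encodedLength returns how many bytes a float64 needs under the assumed
// scheme: ten 7-bit groups with trailing zero groups dropped, but never
// fewer than one byte. (Hypothetical helper for illustration.)
func encodedLength(f float64) int {
	u := math.Float64bits(f)
	n := u ^ ((u>>63)*^uint64(0) | (1 << 63))
	length := 10
	for length > 1 && n&0x7f == 0 {
		n >>= 7
		length--
	}
	return length
}

// histogram counts encoded lengths for the integers 0 through limit inclusive.
func histogram(limit int) map[int]int {
	counts := make(map[int]int)
	for i := 0; i <= limit; i++ {
		counts[encodedLength(float64(i))]++
	}
	return counts
}

func main() {
	counts := histogram(100000)
	fmt.Println("Length\tCount")
	for length := 1; length <= 10; length++ {
		if c := counts[length]; c > 0 {
			fmt.Printf("%d\t%d\n", length, c)
		}
	}
}
```

The one-byte entry is zero and the two-byte entry is 2; the three-byte group is made up of integers that need at most three mantissa bits.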

Unfortunately, none of my benchmarks prove any performance increase because they focus on corner cases and extreme numbers; the benefits here are to the world’s most boring numbers, namely small non-negative integers.

Here I am, well past retirement age, still getting my jollies from open-source bit-banging. I hope other people manage to preserve their professional passions into later life.

Mozart Requiem 25 Aug 2024, 9:00 pm

Vancouver has many choirs, with differing proficiency levels and repertoire choices. Most gather fall-to-spring and take the summer off. Thus, Summerchor, which aggregates a couple of hundred singers from many choirs to tackle one of the Really Big choral pieces each August. This year it was the Mozart Requiem. Mozart died while writing this work and there are many “completions” by other composers. Consider just the Modern completions; this performance was of Robert Levin’s.

Mozart’s Requiem performed in St. Andrews-Wesley

Summerchor performs Mozart’s Requiem in St. Andrews-Wesley United Church, August 24, 2024.

The 200 singers were assisted by four soloists, a piano, a trombone, the church’s excellent pipe organ, and finally, of course, by the towering arched space.

The combined power of Mozart’s music, the force of massed voices, and the loveliness of the great room yielded an entirely overwhelming torrent of beauty. I felt like my whole body was being squeezed.

Obviously, in an hour-long work, some parts are stronger than others. For me, the opening Requiem Aeternam and Kyrie hit hard, then the absolutely wonderful ascending line opening the Domine Jesu totally crushed me. But for every one of the three-thousand-plus seconds, I was left in no doubt that I was experiencing about as much beauty as a human being can.

God?

We were in a house of worship. The words we were listening to were liturgical, excerpted from Scripture. What, then, of the fact that I am actively hostile to religion? Yeah, I freely acknowledge that all this beauty I’m soaking up is founded on faith. Other people’s faith, of other times. The proportion of people who profess that (or any) faith is monotonically declining, maybe not everywhere, but certainly where I live.

Should I feel sad about that? Not really; the fact that architects and musicians worked for the Church is related to the fact that the Church was willing to pay. Musicians can manage without God, generally, as long as they’re getting paid.

The sound

We’ve been going out to concerts quite a lot recently so of course I’ve been writing about it, and usually discussing the sound quality too.

The sound at the Requiem was beyond awesome. If you look at the picture above you can see there’s a soundboard and a guy sitting at it, but I’m pretty sure the only boost was on the piano, which had to compete with 200 singers and the organ. So, this was the usual classical-music scenario: If you want dynamic range, or to hear soloists, or to blend parts, you do that with musical skill and human throats and fingers and a whole lot of practice. There’s no knob to twirl.

I mean, I love well-executed electric sound, but large-scale classical, done well, is on a whole other level.

Above, I mentioned the rising line at the opening of the Domine Jesu; the pulses, and the space between them, rose up in the endless vertical space as they rose up the scale, and yet were clearly clipped at start and end, because you don’t get that much reverb when the pews are full of soft human flesh and the roof is made of old wood, no matter how big the church is. I just don’t have words for how wonderful it sounded.

Classical?

Obviously, this is what is conventionally called “classical” music. But I’m getting a little less comfortable with that term, especially the connotation that it’s “music for old people” (even though I am one of those). Because so is rootsy rock music and bluegrass and Americana and GoGo Penguin and Guns N’ Roses.

2024 Pollscrolling 20 Aug 2024, 9:00 pm

The 2024 US election has, in the last few weeks, become the most interesting one I can recall. I’m pretty old, so that’s a strong statement; I can recall a lot of US elections. The Internet makes it way too easy to obsess over a story that’s this big and has this many people sharing opinions. Here is my opinion, not on who’s winning, but on how, with only a very moderate expenditure of time and money, you can be as well-informed as anybody in the world as to how it’s going.

Disclosures: I’m not American, got no vote on this one. Am left of the US Democratic party, but at this point they represent an infinitely better option for America than does anything connected to Donald Trump. Thus, the following remarks should be assumed to have a strong Harris/Walz bias.

I claim that using the following sites should, in 5-15 minutes a day, give you an understanding of the state of the race that is very close to that of the big-name prognosticators writing in the big-name publications.

Polls generally

They’ve been wrong a lot recently. The industry is still trying to complete the transition off landlines, which few people have any more; those who still do are an entirely unrepresentative sample. I’ve seen a few smart people guessing that pollers may be getting better, this cycle. We’ll see, won’t we?

Prognosticators generally

They suck. The arm-waving, gut-feel bullshit, and undisclosed bias is disgusting. In the pages of the really big properties like the NYT and WaPo, the opinion columnists are mostly partisan hacks looking for reasons to explain why their favored party is doing just fine and the other is failing. I’ve pretty well given up reading them.

OK, now let’s get to our sources, in no particular order.

Talking Points Memo

TPM is the home of Josh Marshall, the ur-blogger on US Politics, and still as good as anyone. He’s hired a bunch of other clear and clear-eyed writers, who obsess about elections all day every day. They have a strong and acknowledged pro-Democratic and anti-Trump bias, but in my opinion don’t let it clutter their analyses. If you’re reading the polling sites that I’m recommending below, TPM will have smart interpretive pieces about what they might mean.

A lot of their stuff is free, but the best isn’t. Unfortunately, they only offer annual subscriptions at US$70; I’ve subscribed for many years. It depends on how fierce the election monkey on your back is, but in my personal experience, there is no other site that offers more depth and clarity on US politics. Maybe worth your money for one year, this year.

Nate Silver

The former 538 guy is now at the Silver Bulletin. Nate is not everyone’s cup of tea, but he retained the IP rights to the 538 election model, which is updated on a daily basis. He admits to being moderately pro-Harris this time around but argues convincingly that he doesn’t let this cloud his analysis, which I generally find pretty cogent. You probably need a little bit of statistical literacy to fully appreciate this stuff.

As of today, August 20, 2024, his probability-of-win numbers are Harris 53.6%, Trump 45.7%.

The Silver Bulletin has some free stuff, but the best parts, including the model updates, are paywalled. The price is currently $14/month, but on September 1st they’re going up to $20, just until the election is over. Because a lot of people like me signed up with every expectation of unsubscribing in three months.

If you happen to care about professional sports there’s lots of brain candy there too.

538

Their Latest Polls page is now part of ABC news but seems to retain many of the virtues that made 538 the flavor-of-the-month a few years ago.

The crucial thing, if you’re visiting once a day, is to find the “Sort by Date” widget and click on “Added”, which brings the most recent stuff that you probably haven’t seen yet to the top.

As I write this, their national polling average has Harris 46.6%, Trump 43.8%. This is quite different from Silver’s “probability of winning”. 538’s main virtue is that they get the polls up about as fast as anyone else. It’s free.

RCP

I kind of hate to mention RealClearPolitics because it is at least in part a hive of filthy MAGA-friendly screamers. Their front page is all links, MAGA-dominated but including a sprinkling of analysis from more even-handed or openly-progressive sources. Anyhow, the problem with that stuff isn’t the bias, it’s the fact that they’re just a bunch of low-value prognosticators. I wouldn’t waste much time on the front page, I’d start by clicking on “Polls”, near the top left.

Their aggregation of poll results will contain about what you’ll see at 538 (sometimes important polls get to one place first, sometimes the other). But RCP offers other useful resources. There is the RCP Pollster Scorecard, which offers data on the accuracy and bias of most of the pollers whose results they report. Since some of those pollers are super extra biased, this can be a useful sanity check.

What I really like is the Electoral College Map, which as I write is predicting a Trump victory, 287-251 in EC votes. You can click on each state and see the polls they used to compute their prediction for that state.

I think their MAGA bias shows in the predictions, but that’s OK, because there’s a “Create Your Own Map” link, where you can disagree with them and explore each side’s path to victory or defeat. Looking at today’s map, my conclusion is that if Harris can flip Pennsylvania from red to blue she probably wins, and if she can bring along either or both of Arizona and North Carolina, Trump is roadkill.

CNN

No, really. It’s cheesy and overhyped but feels to me like it’s speaking to a pretty big constituency that I don’t know anybody from; there’s a bit of zeitgeist in the flow. To my eye it leans a little more Dem than GOP (that’s a surprise) and is not actually terrible.

When?

My advice is to wait until at least mid-afternoon, when the polls of the day have been published and ingested, then put in your pollscrolling time. Won’t take too long and you’ll know what the allegedly-smart people know.

Basic Infrastructure 13 Aug 2024, 9:00 pm

Recently, I was looking at the infrastructure bills for our CoSocial co-op member-owned Mastodon instance, mostly Digital Ocean and a bit of AWS. They seemed too high for what we’re getting. Which makes me think about the kind of infrastructure that a decentralized social network needs, and how to get it.

I worked at AWS for 5½ years and part of my job was explaining why public-cloud infrastructure is a good idea. I had no trouble doing that because, for the people who are using it, it was (and is) a good idea. The public cloud offers a quality of service, measured by performance, security, and durability, that most customers couldn’t build by themselves. One way to put it is this: If you experience problems in those areas, they are much more likely to be problems in your software than in the cloud infrastructure.

Of course, providing this level of service costs billions in capex and salaries for thousands of expensive senior engineers. So you can expect your monthly cloud-services bill to be substantial.

But what if…

What if you don’t need that quality of service? What if an hour of downtime now and then was an irritant but not an existential problem? What if you were OK with occasionally needing to restore data from backup? What if everything on your server was public data and not interesting to bad actors?

Put another way, what if you were running a small-to-medium Fediverse instance?

If it goes offline occasionally, nobody’s life is damaged much. And, while I grant that this is not well-understood, at this point in time everything on Fedi should be considered public, and I don’t think that’ll change even when we get end-to-end encryption because that data of course isn’t plain text. Here is what you care about:

Members’ posts don’t get permanently lost.
You don’t want bad people hijacking your members’ accounts and posting damaging stuff.
You don’t want to provision and monitor a relational database.

“Basic”?

So, what I want for decentralized social media is computers and storage “in the cloud”, as in I don’t want to have to visit them physically. But I don’t need them to be very fast or to be any more reliable than modern server and disk hardware generally are. I do need some sort of effective backup/restore facility, and I want good solid modern authentication.

And, of course, I want this to be a whole lot cheaper than the “enterprise”-facing public cloud. Because I’m not an enterprise.

(I think I still need a CDN. But that’s OK because they’re commoditized and competitive these days.)

I know this is achievable. What I don’t know is who might want to offer this kind of infrastructure. I think some of it is already out there, but you have to be pretty savvy about knowing who the vendors are and their product ranges and strengths and weaknesses.

Maybe we don’t need any new products, just a new product category, so people like me know which products to look at.

How about “Basic Infrastructure”?

Countrywomen 11 Aug 2024, 9:00 pm

In the last couple of weeks I’ve been at shows by Molly Tuttle and Sierra Ferrell (I recommend clicking both those links just for the front-page portraits). Herewith thoughts on the genres, performances, and sound quality.

Tuttle is post-bluegrass and Ferrell is, um, well, Wikipedia says “folk, bluegrass, gypsy jazz, and Latin styles” which, OK, I guess, but it doesn’t mention pure old-fashioned country, her strongest flavor. These days, “Americana” is used to describe both these artists. The notion that Americana implies “by white people” is just wrong, check out Rhiannon Giddens’ charming video on the origin of the banjo. (Ms Giddens is a goddess; if you don’t know about her, check her out.)

Both bands (for brevity, just Molly and Sierra) feature mandolin, fiddle, stand-up bass, and acoustic guitar. Molly adds banjo, Sierra drums and occasional electric guitar. Both offer flashy instrumental displays; Molly adds big group meltdowns, veering into jam-band territory. Both women sing divinely and the bands regularly contribute lovely multi-part harmonies.

I think that Americana just now is one of the most interesting living musical directions. These artists are young, are standing on firm foundations, and are pushing into new territory. Judging by the crowds these days I’m not alone, so for those who agree, I’ll offer a few words on each performance.

Molly Tuttle and Golden Highway at the Hollywood Theatre

Interestingly, this is the same venue where I first saw Sierra, back in March of 2022. It’s intimate and nice-looking and has decent sound.

The crowd was pretty grey-haired; the previous week we’d taken in an Early Music Vancouver concert dedicated to Gabrieli (1557-1612) and the age demographic wasn’t that different, except that Molly’s fans wear jeans and leather and, frequently, hippie accoutrements. It dawns on me that bluegrass is in some respects a “classical” genre; it has lots of rules and formalisms and an absolute insistence on virtuosic skill.

She played a generous selection of favorites (El Dorado, Dooley’s Farm, Crooked Tree) and exceptionally tasty covers (Dire Wolf, She’s a Rainbow). The band was awesomely tight and Molly was in fine form.

In most pictures of Molly she has hair, but during her concerts she usually tells the story of how as a young child she had Alopecia universalis (total all-body hair loss) and, particularly if the concert venue is warm, whips off her wig. At this show she talked about how, on behalf of a support organization, she’d visited a little Vancouver girl with Alopecia, and how sad she was that the kid couldn’t come to the show since the Hollywood is also a bar. It was touching; good on her.

Molly, a fine singer and songwriter, is also a virtuoso bluegrass guitar flat-picker and her band are all right up there, so the playing on balance was probably a little finer than Sierra’s posse offered. And as I mentioned, they do the occasional jam-band rave-up, which I really enjoyed.

But their sound guy needs to be fired. I was at the show alone and thus found a corner to prop myself up that happened to be right behind this bozo’s desk. He had a couple of devices that I didn’t recognize, with plenty of sliders, physical and on-screen, and he was hard at work from end to end “enhancing” the sound. He threw oceans of echo on Molly’s voice then yanked it out, injected big rumble on song climaxes, brightened up the banjo and mandolin so they sounded like someone driving nails into metal, and slammed the balance back and forth to create fake stereo when licks were being traded. This sort of worked when they were doing the extended-jam thing, but damaged every song that relied on sonic truth or subtlety, which was most of them. Feaugh. Concert sound people should get out of the fucking way and reproduce what the musicians are playing. I guess Molly must like this or she wouldn’t have hired him? I wish she could come out and hear what it sounds like though.

Anyhow, it’s a good band that plays good songs with astonishing skill. If you’re open to this kind of music you’d enjoy their show.

The last encore was Helpless. I’m not 100% sure that Molly knew what she was in for. Every grey-haired Canadian knows that tune and every word of its lyrics. So as soon as she was three words in, the whole audience was booming along heartily, having a fine time. Quite a few grizzled cheeks were wet with tears, but I thought Molly looked a little taken aback. She went with it, and it was lovely.

Sierra Ferrell at the Orpheum

This hall is one of Vancouver’s two big venues where the symphony plays, operas are presented, and so on. It opened in 1927 and the decor is lavish, tastefully over-the-top, but ignore the execrable ceiling art.

Sierra Ferrell, singing Whispering Waltz.

On this picture, my usually-trusty Pixel 7 failed me. The focus is unacceptably bad but I’m running it anyhow to share Sierra’s outfit, which is as always fabulous. She kicked up her heels once or twice, revealing big tall Barbie-pink boots under that dress.

The audience had plenty of greybeards but on balance was way younger than Molly’s, with a high proportion of women dressed to the nines in Western-wear finery and some of the prettiest dresses I’ve seen in years. It was really a lot of fun just to look around and enjoy the shapes that Sierra’s influence takes.

Sierra is a wonderful singer but those songs, wow, I’m sure some of them will be loved and shared long after I and she are in the grave. Her set didn’t leave out any of the favorites. There were a few covers, notably Me and Bobby McGee, which was heartbreaking and then rousing. Before starting Sierra acknowledged her debt to Janis Joplin, whom I never saw, but I felt Janis there in spirit.

Everybody is going to have a few favorites among her songs. The three-song sequence, Lighthouse, The Sea, and Far Away Across the Sea, was so beautiful it left me feeling emptied. They turned Far Away into a rocker with a bit of extended jamming and it was just wonderful.

But the thing about a Sierra Ferrell show isn’t just the songs or the singing or the playing, it’s her million watts of charisma, and the connection with the crowd. People kept bringing her floral garlands and, after “Garden”, someone ran up to the stage with a little potted plant. There are some people who, when they get up on the stage, you just can’t take your eyes off them, and she’s one of those. I’m pretty confident that if she keeps holding it together and writing those songs, she’s headed for Dolly Parton territory in terms of fame and fortune.

Any complaints? Yes, this was the first stop on a new tour and the sound was initially pretty rough, but they got it fixed up so that’s forgivable. There’s still a problem: When Sierra leans into a really big note she overloads whatever mike they’re using; not sure what the cure is for that.

Another gripe: Sierra used to have a part of the set where the band gathered around an old-school radio mike with acoustic instruments and played in a very traditional style. I think she shouldn’t leave that out.

Finally, one more problem: Vancouver loves Sierra just a little too much. Every little vocal flourish, every cool little instrumental break, every one of those got a huge roar of approval from the crowd, which, fine, but some of those songs take the level way down and then back up again in a very artful way, and I wished the crowd would shut up, let Sierra drive, and clap at the end of the song.

Americana

Like I said, this is where some of the most interesting living artists are digging in and doing great work. Highly recommended.

Invisible Attackers 30 Jul 2024, 9:00 pm

In the last few days we’ve had an outburst of painful, intelligent, useful conversation about racism and abuse in the world of Mastodon and the Fediverse. I certainly learned things I hadn’t known, and I’m going to walk you through the recent drama and toss in ideas on how to improve safety.

For me, the story started back in early 2023 when Timnit Gebru (the person fired by Google for questioning the LLM-is-great orthodoxy, co-author of Stochastic Parrots) shouted loudly and eloquently that her arrival on Mastodon was greeted by a volley of racist abuse. This shocked a lot of hyperoverprivileged people like me who don’t experience that stuff. As the months went on after that, my perception was that the Mastodon community had pulled up its moderation socks and things were getting better.

July 2024

Then, just this week, Kim Crayton issued a passionate invitation to the “White Dudes for Kamala Harris” event, followed immediately by examples of the racist trolling that she saw in response. With a content warning that this is not pretty stuff, here are two of her posts: 1, 2.

Let me quote Ms Crayton:

The racist attacks you’ve witnessed directed at me since Friday, particularly by instances with 1 or 2 individuals, SHOULD cause you to ask “why?” and here’s the part “good white folx” often miss…these attacks are about YOU…these attacks are INTENDED to keep me from putting a mirror in your faces and showing you that YOU TOO are harmed by white supremacy and anti-Blackness…these attacks are no different than banning books…they’re INTENDED to keep you IGNORANT about the fact that you’re COMPLICIT

She quite appropriately shouted at the community generally and the Mastodon developers specifically. Her voice was reinforced by many others, some of whom sharpened the criticism by calling the Mastodon team whiteness-afflicted at best and racist at worst.

People asked a lot of questions and we learned a few things. First of all, it turns out that some attackers came from instances that are known to be toxic and should long since have been defederated by Ms Crayton’s. Defederation is the Fediverse’s nuclear weapon, our best tool for keeping even the sloppiest admins on their toes. To the extent our tools work at all, they’re useless if they’re not applied.

But on the other hand it’s cheap and fast to spin up a single-user Mastodon instance that won’t get defederated until the slime-thrower has thrown slime.

Invisibility

What I’ve only now come to understand is that Mastodon helps griefers hide. Suppose you’re on instance A and looking at a post from instance B, which has a comment from an account on instance C. Whether or not you can see that comment… is complicated. But lots of times, you can’t. Let me excerpt a couple of remarks from someone who wishes to remain anonymous.

Thinking about how mastodon works in the context of all the poc i follow who complain constantly about racist harassment and how often i look at their mentions and how I’ve literally never seen an example of the abuse they’re experiencing despite actively looking for it.

It must be maddening to have lots of people saying horrible things to you while nobody who’d be willing to defend you can see anyone doing anything to you.

But also it really does breed suspicion in allies. I believe it when people say they’re being harassed, but when I’m looking for evidence of it on two separate instances and not ever seeing it? I have to step hard on the part of me that’s like … really?

Take-away 1

This is a problem that the Masto/Fedi community can’t ignore. We can honestly say that up till now, we didn’t realize how serious it was. Now we know.

Take-away 2

Let’s try to cut the Mastodon developers some slack. Here’s a quote from one, in a private chat:

I must admit that my mentions today are making me rethink my involvement in Mastodon

I am burning myself out for this project for a long time, not getting much in return, and now I am a racist because I dont fix racism.

I think it is entirely reasonable to disagree with the team, which is tiny and underfunded, on their development priorities. Especially after these last few days, it looks like a lot of people — me, for sure — failed to dive deep into the narrated experience of racist abuse. In the team’s defense, they’re getting yelled at all the time by many people, all of whom have strong opinions about their feature that needs to ship right now!

Conversations

One of the Black Fedi voices that most influences me is Mekka Okereke, who weighed in intelligently, from which this, on the subject of Ms Crayton:

She should not have to experience this

It should be easier for admins at DAIR, and across the whole Fediverse, to prevent this

Mekka has set up a meeting with the Mastodon team and says Ms Crayton will be coming along. I hope that turns out to be useful.

More good input

Let’s start with Marco Rogers, also known as @polotek@social.polotek.net. I followed Marco for ages on Twitter, not always agreeing with his strong opinions on Web/Cloud technology, but always enjoying them. He’s been on Mastodon in recent months and, as usual, offers long-form opinions that are worth reading.

He waded into the furore around our abuse problem, starting here, from which a few highlights.

I see a lot of the drama that is happening between people of color on the platform and the mastodon dev team. I feel like I need to help.

If people of color still find ourselves dependent on a small team of white devs to get what we want, that is a failure of the principles of the fediverse.

I want to know how I can find and support people that are aligned with my values. I want to enable those people to work on a platform that I can use. And we don't need permission from the mastodon team to do so. They're not in charge.

Mekka, previously mentioned, re-entered the fray:

If you run a Mastodon instance, and you don't block at least the minimum list of known terrible instances, and you have Black users, it's just a matter of time before your users face a hate brigade.

That's the only reason these awful instances exist. That's all they do.

Telling users "Just move to a better server!" is supremely unhelpful. It doesn't help the mods, and it doesn't help the users.

It needs to be easier. It's currently too hard to block them and keep up with the new ones.

And more; this is from Jerry Bell, one of the longest-lasting Fediverse builders (and I think the only person I’m quoting here who doesn’t present as Black). These are short excerpts from a long and excellent piece.

I am writing this because I'm tired of watching the cycle repeat itself, I'm tired of watching good people get harassed, and I'm tired of the same trove of responses that inevitably follows.

… About this time, the sea lions show up in replies to the victim, accusing them of embracing the victim role, trying to cause racial drama, and so on.

A major factor in your experience on the fediverse has to do with the instance you sign up to. Despite what the folks on /r/mastodon will tell you, you won't get the same experience on every instance.

What next?

I don’t know. But I feel a buzz of energy, and smart people getting their teeth into the meat of the problem.

Now I have thoughts to offer about moving forward.

Who are the enemy?

They fall into two baskets: Professional and amateur. I think the current Mastodon attackers are mostly amateurs. These are lonely Nazis, incels, channers, your basic scummy online assholes. Their organization is loose at best (“He’s pointing at her, so I will too”), and they’re typically not well-funded nor are they deep technical experts.

Then there are the pros, people doing this as their day job. I suspect most of those are working for nation states, and yes, we all know which nation states those probably are. They have sophisticated automation to help them launch armies of bots.

Here are some suggestions about potential fight-backs, mostly aimed at amateurs.

Countermeasure: Money

There’s this nonprofit called IFTAS which is working on tools and support structures for moderation. How about they start offering a curated allowlist of servers that it’s safe to federate with? How do you get on that list? Pay $50 to IFTAS, which will add you to the watchlist, and also to a service scanning your members’ posts for abusive stuff during your first month or so of operation.

Cue the howls of outrage saying “Many oppressed people can’t afford $50, you’re discriminating against the victims!” I suppose, but they can still get online at any of the (many) free-to-use instances. I think it’s totally reasonable to throw a $50 roadblock in the process of setting up a server.

In this world, what happens? Joe Incel sets up an instance at ownthelibs.nazi or wherever, pays his $50, and starts throwing slime. This gets reported and pretty soon, he’s defederated. Sure, he can do it again. But how many times is this basement-dweller willing to spend $50, leaving a paper trail each time just in case he says something that’s illegal to say in the jurisdiction where he lives? Not that many, I think?

Countermeasure: Steal from Pleroma

It turns out Mastodon isn’t the only Fediverse software. One of the competitors is Pleroma. Unfortunately, it seems to be the server of choice for our attackers, because it’s easy and cheap to set up. Having said that, its moderation facilities are generally regarded as superior to Mastodon’s, notably a subsystem called Message Rewrite Facility (MRF) which I haven’t been near but is frequently brought up as something that would be useful.

Countermeasure: Make reporting better

I report abusive posts sometimes, and, as a moderator for CoSocial.ca I see reports too. I think the “Report post” interface on many clients is weak, asking you unnecessary questions.

And when I get a report, it seems like half the time none of the abusive material is attached, and it takes me multiple clicks to look at the reported account’s feed, which feels like a pretty essential step.

Here’s how I’d like reporting to work.

There’s a single button labeled “Report this post”. When you click it, a popup says “Reported, thanks” and you’re done. Maybe it could query whether you want to block the user or instance, but it’s super important that the process be lightweight.
The software should pull together a report package including the reported post’s text and graphics. (Not just the URLs, because the attackers like to cover their tracks.) Also the attacker’s profile page. No report should ever be filed without evidence.

Countermeasure: Rules of thumb

Lauren offered this: Suppose a reply or mention comes in for someone on Instance A from someone on Instance B. Suppose Instance A could check whether anyone else on A follows anyone on B. If not, reject the incoming message. This would have to be a per-user not global setting, and I see it as a placeholder for a whole class of heuristics that could usefully get in the attackers’ way.
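To make that rule of thumb concrete, here is a minimal Go sketch of the check. Everything in it is an assumption for illustration: the instance type, its follows and strict fields, and the acceptMention function are hypothetical and don’t correspond to anything in Mastodon’s code base. It reads “anyone else” literally, so the recipient’s own follows don’t count.

```go
package main

import "fmt"

// instance is a hypothetical, much-simplified model of one server's local state.
type instance struct {
	// follows maps each local account to the set of remote domains
	// hosting at least one account it follows.
	follows map[string]map[string]bool
	// strict records which local accounts have opted in to the heuristic.
	strict map[string]bool
}

// acceptMention reports whether a reply or mention arriving from senderDomain
// should be delivered to the local account named recipient.
func (in *instance) acceptMention(recipient, senderDomain string) bool {
	if !in.strict[recipient] {
		return true // per-user setting: accounts that haven't opted in get everything
	}
	for account, domains := range in.follows {
		if account != recipient && domains[senderDomain] {
			return true // someone else here follows an account on that instance
		}
	}
	return false // nobody else here has any tie to the sender's instance
}

func main() {
	a := &instance{
		follows: map[string]map[string]bool{
			"alice": {"b.example": true},
			"bob":   {},
		},
		strict: map[string]bool{"bob": true},
	}
	fmt.Println(a.acceptMention("bob", "b.example"))   // alice follows someone there
	fmt.Println(a.acceptMention("bob", "c.example"))   // no local ties to c.example
	fmt.Println(a.acceptMention("alice", "c.example")) // alice hasn't opted in
}
```

A real implementation would have to decide what happens to rejected messages (drop, hold for review, or notify a moderator), which this sketch leaves out.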

Wish us luck

Obviously I’m not claiming that any of these ideas are the magic bullet that’s going to slay the online-abuse monster. But we do need ideas to work with, because it’s not a monster that we can afford to ignore.

I care intensely about this, because I think decentralization is an essential ingredient of online conversation, and online conversation is valuable, and if we can’t make it safe we won’t have it.

Union of Finite Automata 28 Jul 2024, 9:00 pm

In building Quamina, I needed to compute the union of two finite automata (FAs). I remembered from some university course 100 years ago that this was possible in theory, so I went looking for the algorithm, but was left unhappy. The descriptions I found tended to be hyper-academic, loaded with mathematical notation that I found unhelpful, and didn’t describe an approach that I thought a reasonable programmer would reasonably take. The purpose of this ongoing entry is to present a programmer-friendly description of the problem and of the algorithm I adopted, with the hope that some future developer, facing the same problem, will have a more satisfying search experience.

There is very little math in this discussion (a few subscripts), and no circles-and-arrows pictures. But it does have working Go code.

Finite automata?

I’m not going to rehash the theory of FAs (often called state machines). In practice the purpose of an FA is to match (or fail to match) some input against some pattern. What the software does when the input matches the pattern (or doesn’t) isn’t relevant to our discussion today. Usually the inputs are strings and the patterns are regular expressions or equivalent. In practice, you compile a pattern into an FA, and then you go through the input, character by character, trying to traverse the FA to find out whether it matches the input.

An FA has a bunch of states, and for each state there can be a list of input symbols that lead to transitions to other states. What exactly I mean by “input symbol” turns out to be interesting and affects your choice of algorithm, but let’s ignore that for now.

The following statements apply:

One state is designated as the “start state” because, well, that’s where you start.
Some states are called “final”, and reaching them means you’ve matched one or more patterns. In Quamina’s FAs, each state has an extra field (usually empty) saying “if you got here you matched P*, yay!”, where P* is a list of labels for the (possibly more than one) patterns you matched.
It is possible that you’re in a state and for some particular input, you transition to more than one other state. If this is true, your FA is nondeterministic, abbreviated NFA.
It is possible that a state can have one or more “epsilon transitions”, ones that you can just take any time, not requiring any particular input. (I wrote about this in Epsilon Love.) Once again, if this is true, you’ve got an NFA. If neither this statement nor the previous one is true, it’s a deterministic finite automaton, DFA.

The discussion here works for NFAs, but lots of interesting problems can be solved with DFAs, which are simpler and faster, and this algorithm works there too.
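For readers who prefer code to definitions, here is a minimal sketch of a deterministic FA in Go and of what “traversing” it means. It is an illustration under simplifying assumptions, not Quamina’s data structure: the state type, its next and matched fields, and the literal and match helpers are all hypothetical names, and a plain Go map stands in for the transition table.

```go
package main

import "fmt"

// state is a hypothetical, simplified DFA state; it is not Quamina's faState.
type state struct {
	next    map[byte]*state // at most one transition per input byte, so this is a DFA
	matched []string        // labels of patterns matched on arrival; usually empty
}

// literal builds a DFA that matches exactly the string s and reports label.
func literal(s, label string) *state {
	start := &state{next: map[byte]*state{}}
	cur := start
	for i := 0; i < len(s); i++ {
		nxt := &state{next: map[byte]*state{}}
		cur.next[s[i]] = nxt
		cur = nxt
	}
	cur.matched = []string{label} // the last state is the "final" one
	return start
}

// match walks the automaton byte by byte from the start state and returns
// the labels found on the state reached after the last byte, if any.
func match(start *state, input string) []string {
	cur := start
	for i := 0; i < len(input); i++ {
		nxt, ok := cur.next[input[i]]
		if !ok {
			return nil // no transition on this byte: the input doesn't match
		}
		cur = nxt
	}
	return cur.matched
}

func main() {
	fa := literal("foo", "P1")
	fmt.Println(match(fa, "foo")) // reaches the final state
	fmt.Println(match(fa, "fox")) // falls off the automaton at 'x'
}
```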

Union?

If I have FA1 that matches “foo” and FA2 that matches “bar”, then their union, FA1 ∪ FA2, matches both “foo” and “bar”. In practice Quamina often computes the union of a large number of FAs, but it does so a pair at a time, so we’re only going to worry about the union of two FAs.

The academic approach

There are plenty of Web pages and YouTubes covering this. Most of them are full of Greek characters and math symbols. They go like this:

You have two FAs, call them A and B. A has states A₁, … A_maxA, B has B₁, … B_maxB
The union contains all the states in A, all the states in B, and the “product” of A and B, which is to say states you could call A₁B₁, A₁B₂, A₂B₁, A₂B₂, … A_maxAB_maxB.
For each state A_XB_Y, you work out its transitions by looking at the transitions of the two states being combined. For some input symbol, if A_X has a transition to A_XX but B_Y has no transition, then the combined state just has the A transition. The reverse for an input where B_Y has a transition but A_X doesn’t. And if A_X transitions to A_XX and B_Y transitions to B_YY, then the transition is to A_XXB_YY.
Now you’ll have a lot of states, and it usually turns out that many of them aren’t reachable. But there are plenty of algorithms to filter those out. You’re done, you’ve computed the union and A₁B₁ is its start state!

Programmer-think

If you’re like me, the idea of computing all the states, then throwing out the unreachable ones, feels wrong. So here’s what I suggest, which has worked well in practice for Quamina:

First, merge A₁ and B₁ to make your new start state A₁B₁. Here’s how:
If an input symbol causes no transitions in either A₁ or B₁, it also doesn’t cause any in A₁B₁.
If an input symbol causes a transition in A₁ to A_X but no transition in B₁, then you adopt A_X into the union, and any other A states it points to, and any they point to, and so on.
And of course if B₁ has a transition to B_Y but A₁ doesn’t transition, you flip it the other way, adopting B_Y and its descendants.
And if A₁ transitions to A_X and B₁ transitions to B_Y, then you adopt a new state A_XB_Y, which you compute recursively the way you just did for A₁B₁. So you’ll never compute anything that’s not reachable.
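The merge steps above can be sketched in a few lines of Go. This is a deliberately stripped-down version that assumes deterministic FAs and uses a plain map[byte] for transitions; the names state, pair, union, literal, and match are hypothetical and chosen for this illustration only. The memo map keeps any pair of states from being merged more than once.

```go
package main

import "fmt"

// state is a hypothetical simplified DFA state, not Quamina's faState.
type state struct {
	next    map[byte]*state
	matched []string
}

// pair identifies two states being merged; it serves as the memo key.
type pair struct{ a, b *state }

// union merges a and b, recursing only where both have a transition on the
// same byte. States reachable from just one side are adopted unchanged.
func union(a, b *state, memo map[pair]*state) *state {
	if m, ok := memo[pair{a, b}]; ok {
		return m
	}
	m := &state{next: map[byte]*state{}}
	// copy into a fresh slice so neither input state's labels are disturbed
	m.matched = append(append([]string{}, a.matched...), b.matched...)
	memo[pair{a, b}] = m
	for sym, an := range a.next {
		if bn, ok := b.next[sym]; ok {
			m.next[sym] = union(an, bn, memo) // both transition: merge recursively
		} else {
			m.next[sym] = an // only a transitions: adopt its subtree
		}
	}
	for sym, bn := range b.next {
		if _, ok := a.next[sym]; !ok {
			m.next[sym] = bn // only b transitions: adopt its subtree
		}
	}
	return m
}

// literal builds a DFA matching exactly s, labeled with label.
func literal(s, label string) *state {
	start := &state{next: map[byte]*state{}}
	cur := start
	for i := 0; i < len(s); i++ {
		nxt := &state{next: map[byte]*state{}}
		cur.next[s[i]] = nxt
		cur = nxt
	}
	cur.matched = []string{label}
	return start
}

// match returns the labels on the state reached by consuming input, or nil.
func match(start *state, input string) []string {
	cur := start
	for i := 0; i < len(input); i++ {
		nxt, ok := cur.next[input[i]]
		if !ok {
			return nil
		}
		cur = nxt
	}
	return cur.matched
}

func main() {
	memo := map[pair]*state{}
	u := union(literal("foo", "P1"), literal("fob", "P2"), memo)
	fmt.Println(match(u, "foo"), match(u, "fob"), match(u, "fo"))
	fmt.Println(len(memo), "pairs of states were merged")
}
```

With “foo” and “fob” the two automata share the prefix “fo”, so only three pairs get merged (the start states and the states after “f” and “fo”); everything past the point where the inputs diverge is adopted as-is.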

I could stop there. I think that’s enough for a competent developer to get the idea? But it turns out there are a few details, some of them interesting. So, let’s dig in.

“Input symbol”?

The academic discussion of FAs is very abstract on this subject, which is fair enough, because when you’re talking about how to build, or traverse, or compute the union of FAs, the algorithm doesn’t depend very much on what the symbols actually are. But when you’re writing code, it turns out to matter a lot.

In practice, I’ve done a lot of work with FAs over the years, and I’ve only ever seen four things used as input symbols to drive them. They are:

Unicode “characters” represented by code points, integers in the range 0…1,114,111 inclusive.
UTF-8 bytes, which have values in the range 0…244 inclusive.
UTF-16 values, unsigned 16-bit integers. I’ve only ever seen this used in Java programs because that’s what its native char type is. You probably don’t want to do this.
Enum values, small integers with names, which tend to come in small collections.

As I said, this is all I’ve seen, but 100% of the FAs that I’ve seen automatically generated and subject to set-arithmetic operations like Union are based on UTF-8. And that’s what Quamina uses, so that’s what I’m going to use in the rest of this discussion.
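If the 0…244 ceiling on UTF-8 byte values seems surprising, it is easy to check by brute force. The sketch below (maxUTF8Byte is a name made up for this example) encodes every valid code point with Go’s standard unicode/utf8 package and records the largest byte it sees.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// maxUTF8Byte encodes every valid Unicode code point and returns the
// largest byte value seen in any of the encodings.
func maxUTF8Byte() byte {
	var largest byte
	buf := make([]byte, utf8.UTFMax)
	for r := rune(0); r <= utf8.MaxRune; r++ {
		if !utf8.ValidRune(r) {
			continue // skip the surrogate range, which has no UTF-8 encoding
		}
		n := utf8.EncodeRune(buf, r)
		for _, b := range buf[:n] {
			if b > largest {
				largest = b
			}
		}
	}
	return largest
}

func main() {
	fmt.Println(maxUTF8Byte()) // prints 244, i.e. 0xF4
}
```

The maximum comes from the lead byte of the four-byte encodings at the top of the code-point range; byte values 0xF5 through 0xFF never occur in well-formed UTF-8.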

Code starts here

This comes from Quamina’s nfa.go. We’re going to look at the function mergeFAStates, which implements the merge-two-states logic described above.

Lesson: This process can lead to a lot of wasteful work. Particularly if either or both of the states transition on ranges of values like 0…9 or a…z. So we only want to do the work merging any pair of states once, and we want there only to be one merged value. Thus we start with a straightforward memo-ization.

func mergeFAStates(state1, state2 *faState, keyMemo map[faStepKey]*faState) *faState {
    // try to memo-ize
    mKey := faStepKey{state1, state2}
    combined, ok := keyMemo[mKey]
    if ok {
        return combined
    }

Now some housekeeping. Remember, I noted above that any state might contain a signal saying that arriving here means you’ve matched pattern(s). This is called fieldTransitions, and the merged state obviously has to match all the things that either of the merged states match. Of course, in the vast majority of cases neither merged state matched anything and so this is a no-op.

    fieldTransitions := append(state1.fieldTransitions, state2.fieldTransitions...)

Since our memo-ization attempt came up empty, we have to allocate an empty structure for the new merged state, and add it to the memo-izer.

    combined = &faState{table: newSmallTable(), fieldTransitions: fieldTransitions}
    keyMemo[mKey] = combined

Here’s where it gets interesting. The algorithm talks about looking at the inputs that cause transitions in the states we’re merging. How do you find them? Well, in the case where you’re transitioning on UTF-8 bytes, since there are only 245 possible values (0 through 244), why not do the simplest thing that could possibly work and just check each byte value?

Every Quamina state contains a table that encodes the byte transitions, which operates like the Go construct map[byte]state. Those tables are implemented in a compact data structure optimized for fast traversal. But for doing this kind of work, it’s easy to “unpack” them into a fixed-sized table; in Go, [245]state. Let’s do that for the states we’re merging and for the new table we’re building.

    u1 := unpackTable(state1.table)
    u2 := unpackTable(state2.table)
    var uComb unpackedTable

uComb is where we’ll fill in the merged transitions.

Now we’ll run through all the possible input values; i is the byte value, next1 and next2 are the transitions on that value. In practice, next1 and next2 are going to be nil most of the time.

    for i, next1 := range u1 {
        next2 := u2[i]

Here’s where we start building up the new transitions in the unpacked array uComb.

For many values of i, you can avoid actually merging the states to create a new one: if the transition is the same in both input FAs, or if either of them is nil, or if the transitions for this value of i are the same as for the last value. This is all about avoiding unnecessary work and the switch/case structure is the result of a bunch of profiling and optimization.

        switch {
        case next1 == next2: // same transition (or both nil), no need to merge
            uComb[i] = next1
        case next2 == nil: // next1 must be non-nil
            uComb[i] = next1
        case next1 == nil: // next2 must be non-nil
            uComb[i] = next2
        case i > 0 && next1 == u1[i-1] && next2 == u2[i-1]: // dupe of previous step - happens a lot
            uComb[i] = uComb[i-1]

If none of these work, we haven’t been able to avoid merging the two states. We do that by a recursive call to invoke all the logic we just discussed.

There is a complication. The automaton might be nondeterministic, which means that there might be more than one transition for some byte value. So the data structure actually behaves like map[byte]*faNext, where faNext is a wrapper for a list of states you can transition to.

So here we’ve got a nested loop to recurse for each possible combination of transitioned-to states that can occur on this byte value. In a high proportion of cases the FA is deterministic, so there’s only one state from each FA being merged and this nested loop collapses to a single recursive call.

        default: // have to recurse & merge
            var comboNext []*faState
            for _, nextStep1 := range next1.states {
                for _, nextStep2 := range next2.states {
                    comboNext = append(comboNext, mergeFAStates(nextStep1, nextStep2, keyMemo))
                }
            }
            uComb[i] = &faNext{states: comboNext}
        }
    }

We’ve filled up the unpacked state-transition table, so we’re almost done. First, we have to compress it into its optimized-for-traversal form.

    combined.table.pack(&uComb)

Remember, if the FA is nondeterministic, each state can have “epsilon” transitions which you can follow any time without requiring any particular input. The merged state needs to contain all the epsilon transitions from each input state.

    combined.table.epsilon = append(state1.table.epsilon, state2.table.epsilon...)

    return combined
}

And, we’re done. I mean, we are once all those recursive calls have finished crawling through the states being merged.

Is that efficient?

As I said above, this is an example of a “simplest thing that could possibly work” design. Both the recursion and the unpack/pack sequence are kind of code smells, suggesting that this could be a pool of performance quicksand.

But apparently not. I ran a benchmark where I added 4,000 patterns synthesized from the Wordle word-list; each of them looked like this:

{"allis": { "biggy": [ "ceils", "daisy", "elpee", "fumet", "junta", … (195 more).

This produced a huge deterministic FA with about 4.4 million states, with the addition of these hideous worst-case patterns running at 500/second. Good enough for rock ’n’ roll.

How about nondeterministic FAs? I went back to that Wordle source and, for each of its 12,959 words, added a pattern with a random wildcard; here are three of them:

{"x": [ {"shellstyle": "f*ouls" } ] } {"x": [ {"shellstyle": "pa*sta" } ] } {"x": [ {"shellstyle": "utter*" } ] }

This produced an NFA with 46K states; the addition process ran at 70K patterns/second.

Sometimes the simplest thing that could possibly work, works.

