In the age of AI, we need GitHub to allow LLMs to upload screenshots to pull requests without weird workarounds. This morning Claude was working in a repo that didn’t have screenshot upload skills added. When it had trouble attaching screenshots to a PR comment, it went ahead and uploaded those screenshots to imgur.com without consulting me first. That’s a big-old yikes 😂😱

When there’s this much money and power at stake, I suspect there aren’t really any big companies that are the “good guys”, but when Anthropic is spending money and attention on things like this, it does a lot to sway my opinion:

I want to trust AI with code enough that I’m no longer doing line-by-line reviews. But when I look at the code, even from frontier models, I’m still finding small things that feel smelly.

In a simple PR Sonnet 4.6 made, I just found it changing the default mock data constructor for all existing test cases away from the common case to a less-common scenario. All the existing tests passed because they asserted on things that weren’t changed. The update was technically sound, but it’s the kind of change that would feel weird to a human if they came across it later.
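To make that concrete, here’s a minimal Python sketch of the pattern. The names (`User`, `make_user`, `is_trial`, `display_name`) are hypothetical and not from the actual PR; the only point is that a changed factory default can slip past tests that never assert on it.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    is_trial: bool


def make_user(name: str = "Ada", is_trial: bool = True) -> User:
    """Hypothetical shared mock-data factory used by every test.

    Assumption for illustration: the default used to be is_trial=False
    (the common case) and the PR flipped it to True for one new test.
    """
    return User(name=name, is_trial=is_trial)


def display_name(user: User) -> str:
    """Code under test in the existing suite; it ignores is_trial."""
    return user.name.upper()


def test_display_name_existing() -> None:
    # Existing test: still passes, because it never looks at is_trial.
    assert display_name(make_user()) == "ADA"


def test_trial_default_new() -> None:
    # New test: the only one that depends on the changed default.
    assert make_user().is_trial is True


test_display_name_existing()
test_trial_default_new()
```

A less surprising version would probably keep the common case as the default and have the new test pass `is_trial=True` explicitly.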

Diving into test assertions is a good way to focus on the verifiability of a PR when doing AI-assisted reviews. Here’s one of the prompts I often use:

please give me a human-readable outline of the
tests we added specifically calling out what they
assert and how as an html doc
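For a rough, non-AI cross-check of that kind of outline, something like the sketch below could work. It’s only an assumption-laden starting point: it assumes pytest-style Python tests (functions named `test_*` using plain `assert` statements), it won’t see `unittest` methods like `assertEqual`, and the function names `outline_assertions` and `to_html` are made up for this example. It needs Python 3.9+ for `ast.unparse`.

```python
import ast
import html


def outline_assertions(source: str) -> dict[str, list[str]]:
    """Map each test_* function to the source text of its assert statements."""
    tree = ast.parse(source)
    outline: dict[str, list[str]] = {}
    for node in ast.walk(tree):
        is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_func and node.name.startswith("test_"):
            outline[node.name] = [
                ast.unparse(child)
                for child in ast.walk(node)
                if isinstance(child, ast.Assert)
            ]
    return outline


def to_html(outline: dict[str, list[str]]) -> str:
    """Render the outline as a nested HTML list, escaping all text."""
    parts = ["<ul>"]
    for name, asserts in outline.items():
        parts.append(f"<li><b>{html.escape(name)}</b><ul>")
        parts.extend(f"<li><code>{html.escape(a)}</code></li>" for a in asserts)
        parts.append("</ul></li>")
    parts.append("</ul>")
    return "\n".join(parts)


# Hypothetical test file contents, parsed but never executed.
SAMPLE = '''
def test_total():
    assert total([10, 20]) == 30

def helper():
    assert True
'''

print(to_html(outline_assertions(SAMPLE)))
```

This only lists what is asserted, not why, so it complements the prompt rather than replacing it.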