Giving command-tab a break
The other day while waiting for Claude to process prompts, I watched this video: (I know, the irony isn’t lost on me) It highlights the…
In the age of AI, we need GitHub to allow LLMs to upload screenshots to pull requests without weird workarounds. This morning Claude was working in a repo that didn’t have screenshot upload skills added. When it had trouble attaching screenshots to a PR comment, it went ahead and uploaded them to imgur.com without consulting me first. That’s a big ol’ yikes 😂😱
Argh, I’m sad that yet another tool I want to pay for is going to be putting money in Musk’s pocket. I was leaning on Cursor for personal work, but it’s time to find something else.
https://www.cnbc.com/2026/06/16/spacex-spcx-cursor-acquisition-ipo.html
When there’s this much money and power at stake, I suspect there aren’t really any big companies that are the “good guys”, but when Anthropic is spending money and attention on things like this, it does a lot to sway my opinion:
I want to trust AI with code enough that I’m no longer doing line-by-line reviews. But when I look at the code, even from frontier models, I’m still finding small things that feel smelly.
In a simple PR that Sonnet 4.6 made, I just found that it had changed the default mock data constructor used by all existing test cases, switching it from the common case to a less common scenario. All the existing tests still passed because they asserted on things that weren’t changed. The update was technically sound, but it’s the kind of change that would feel weird to a human if they came across it later.
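To make that concrete, here’s a minimal sketch of the pattern in Python. It is not the actual PR: `User`, `make_user_before`, `make_user_after`, and the `is_trial` field are all hypothetical names I made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical model, for illustration only.
    name: str
    plan: str
    is_trial: bool


def make_user_before(**overrides) -> User:
    # Original default: the common case (a paying, non-trial user).
    fields = {"name": "Ada", "plan": "pro", "is_trial": False}
    fields.update(overrides)
    return User(**fields)


def make_user_after(**overrides) -> User:
    # Changed default: a less common scenario (a trial user).
    fields = {"name": "Ada", "plan": "pro", "is_trial": True}
    fields.update(overrides)
    return User(**fields)


def existing_test_passes(make_user) -> bool:
    # Stand-in for an existing test: it only asserts on a field
    # that the change did not touch.
    user = make_user()
    return user.name == "Ada"


# Both factories satisfy the existing assertion...
both_pass = existing_test_passes(make_user_before) and existing_test_passes(make_user_after)
# ...even though every test is now silently running against a different fixture.
defaults_differ = make_user_before() != make_user_after()
```

Nothing fails here, which is the point: the green checkmark says nothing about whether the default fixture still represents the common case.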
Diving into test assertions is a good way to focus on the verifiability of a PR when doing AI-assisted reviews. Here’s one of the prompts I often use:
please give me a human-readable outline of the
tests we added specifically calling out what they
assert and how as an html doc
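For comparison, here’s a rough sketch of a non-LLM version of that outline, assuming pytest-style tests that use plain `assert` statements. It won’t explain the “how” the way the prompt does, and it ignores `unittest`-style `self.assertEqual` calls. The `outline_tests` function and the sample test are hypothetical.

```python
import ast
import html


def outline_tests(source: str) -> str:
    """Render test functions and their assert expressions as a nested HTML list."""
    tree = ast.parse(source)
    items = []
    for node in ast.walk(tree):
        # Assumption: tests are plain functions whose names start with "test_".
        if isinstance(node, ast.FunctionDef) and node.name.startswith("test_"):
            asserts = [
                ast.unparse(child.test)  # requires Python 3.9+
                for child in ast.walk(node)
                if isinstance(child, ast.Assert)
            ]
            rows = "".join(f"<li><code>{html.escape(a)}</code></li>" for a in asserts)
            items.append(f"<li>{html.escape(node.name)}<ul>{rows}</ul></li>")
    return f"<ul>{''.join(items)}</ul>"


# Hypothetical test source; it is only parsed, never executed.
SAMPLE = '''
def test_trial_user_sees_banner():
    user = make_user(is_trial=True)
    assert user.is_trial
    assert banner_for(user) == "Upgrade"
'''

report = outline_tests(SAMPLE)
```

This only lists what each test asserts. The LLM-generated outline is still more useful for review because it can describe intent in plain language.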