Weeknotes: Week 1 - Q1 - 2025

Hello - Happy New Year. I hope you had a good Christmas if you celebrate that kind of thing. We do in the Skinner house: the tree has been up (and now down), presents have been exchanged and lots of food eaten. The weather over the last few weeks has also been glorious, super hot and interspersed with just the right amount of downpours to, so far at least, keep any threat of fire at bay, so I've been down the beach a lot. I love this time of year. I hope you're having a lovely time too.

What day is it again?

Technically I am still on holiday this week and I'm happy to report I have no idea what day it is unless I really think about it. While I've been out of the office I've been doing a few things in between the beach, DIY and eating to get ready for the year ahead. I'm not bothering with resolutions this year but I have been thinking about how I want to approach the year ahead intentionally and positively. The last couple of years, for various reasons, have somewhat disappointingly been neither of these. I work best when I have structure and I know what I am doing. My company More Than Machines has in many ways been pretty successful to date despite me treating it as somewhat of a passion project, but as I go into its fourth year of existence I want to start to be a bit more deliberate with it. More deliberate with its focus as well as its outcomes, and I want to hold myself more accountable for doing what I say I will do.

Removing friction

One of the books I have been reading over the Christmas break has been Malcolm Gladwell's Revenge of the Tipping Point. I won't post a review or anything as, as I understand it, nobody cares, but it has got me thinking about broken windows theory again. There are lots of broken windows in MTM as well as in my own personal tech stack, and in this first quarter I plan to fix as many of these as possible so that I can focus on work and, more importantly at the moment, winning work and selling software. Fixing these niggling points of friction, especially the ones that can be fixed with technology, also really aligns with my ideas around boring AI. You can't tell, but I started with this blog: it's had a good shakedown and much of the stuff that wasn't working well has been fixed up. The code that powers this blog also powers the MTM site so I'll be moving the changes over there in the next week too.

Benchmarking LLMs

As not much has really happened this week I thought I'd start to tell you about one of my holiday projects. I've been mucking about with benchmarking LLMs on their ability to provide good answers to agronomic questions, specifically Australian agronomic questions, for a while now. Following a failed funding attempt to look at this in a lot more detail in the coming year, I decided to revisit my code over Christmas. More to report in the coming weeks, but I have revisited the question set I built using the GRDC GrowNotes and made quite a few improvements to the benchmarking code. It's not ready to share yet as I am still working through the questions, but it's been a fun distraction over the break and should be super useful coming into the new year. Especially as, by all accounts, this is to be the year that AI really starts to make an impact.

Actually, something to talk about is probably the new tool I built to help me sift through the questions. The questions and their multiple choice answers are held in LLM-generated JSON (for the time being I have taken the dataset down from Hugging Face) and there are a lot of them, so going through them all has been a really onerous task that frankly I haven't done. To help with this I have built a little web tool that displays the question and then lets me provide an answer to it. I can also mark a question as rubbish or nonsensical, which quite a few of them seem to be. I generated the questions with an early Llama model on my laptop and I think it may have had more issues than I anticipated. Anyway, it's made the task much easier. I intend to make the tool public once I've had a bit more time with it so that I can crowdsource the review of the questions.
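For anyone curious what the review step boils down to, here is a rough Python sketch of the bookkeeping behind a tool like this. It is not the actual code: the JSON field names (id, question, choices), the Review record and the function names are all assumptions made up for illustration, and the sample questions are placeholders rather than real items from the dataset.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Placeholder data. The real question schema isn't shown in this post,
# so "id", "question" and "choices" are assumed field names.
SAMPLE_JSON = """
[
  {"id": 1, "question": "Placeholder question A?", "choices": ["w", "x", "y", "z"]},
  {"id": 2, "question": "Placeholder question B?", "choices": ["w", "x", "y", "z"]},
  {"id": 3, "question": "Placeholder question C?", "choices": ["w", "x", "y", "z"]}
]
"""


@dataclass
class Review:
    """One reviewer decision about one question (hypothetical structure)."""
    question_id: int
    answer_index: Optional[int] = None  # index into the question's choices
    rubbish: bool = False               # True if the question is nonsensical


def load_questions(raw: str) -> list:
    """Parse the JSON and reject records missing the assumed fields."""
    questions = json.loads(raw)
    required = {"id", "question", "choices"}
    for q in questions:
        if not required <= set(q):
            raise ValueError(f"malformed question record: {q!r}")
    return questions


def record_review(reviews: dict, question: dict,
                  answer_index: Optional[int] = None,
                  rubbish: bool = False) -> Review:
    """Store either an answer or a rubbish flag for a question, never both."""
    if rubbish == (answer_index is not None):
        raise ValueError("give either an answer or a rubbish flag")
    if answer_index is not None and not 0 <= answer_index < len(question["choices"]):
        raise ValueError("answer_index out of range")
    review = Review(question["id"], answer_index, rubbish)
    reviews[question["id"]] = review
    return review


def summarise(questions: list, reviews: dict) -> dict:
    """Count how many questions are answered, flagged, or still pending."""
    flagged = sum(1 for r in reviews.values() if r.rubbish)
    return {
        "answered": len(reviews) - flagged,
        "rubbish": flagged,
        "pending": len(questions) - len(reviews),
    }


questions = load_questions(SAMPLE_JSON)
reviews = {}
record_review(reviews, questions[0], answer_index=2)
record_review(reviews, questions[1], rubbish=True)
print(summarise(questions, reviews))
```

Keeping the reviews separate from the questions means the generated JSON never gets edited in place, which should make it easier to merge answers from several reviewers later if the review does get crowdsourced.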

Building tools

Sticking to my promise of fixing the boring stuff, I thought I would try to share a tool each week that I have made to make my life a little easier. This week I'll just point you to my news headline CLI script. I have a few more I've made that I'll share over time.

The End

That's it - future versions of this will hopefully have a bit more interesting content as the year starts to pick up (can you believe we're a quarter of the way through the century already? The year 2000 still seems very recent to me). I've a lot of fun things planned already. For everyone heading back to work tomorrow, I hope it goes well.
