About Me
I'm currently working as an NLP Engineer @ aveni.
I have a PhD from Heriot-Watt University, where my research was on applying natural language processing-based machine learning algorithms to programming languages.
Prior to that, I obtained a First-Class Master's degree (MEng) in Digital Electronics (now known as Electronics and Computer Engineering) from the University of Sheffield.
In my spare time I enjoy reading, listening to podcasts, playing board games, walking my dog, and watching sports.
You can contact me on Twitter or via e-mail.
Things I've learned, formatted as Python notebooks.
| 2024-12-15 | Making LLMs smarter with structured generation and Outlines | 
| 2024-02-17 | Introduction to information theory (from a machine learning perspective) | 
| 2024-02-10 | Decision trees and random forests from scratch | 
| 2023-10-03 | Bootstrapping confidence intervals | 
| 2023-09-27 | Parallelism in Python | 
| 2023-09-07 | Some practical use cases for the walrus operator | 
| 2023-08-28 | Retrieval augmented generation | 
| 2023-08-27 | Implementing TF-IDF from scratch | 
| 2023-08-06 | A/B testing with the chi-squared test | 
| 2023-08-03 | Fine-tuning a sequence classification model with LoRA and the peft library | 
| 2023-07-31 | How to use OpenAI's "function calling" ability | 
| 2023-07-30 | What's the best way to strip punctuation from a string? | 
| 2023-07-29 | Why is the softmax function "off by one"? | 
Things that I've worked on.
| 2025-08 | The Castles of Burgundy Automata: An implementation of an Automata to play The Castles of Burgundy board game solo. | 
| 2025-07 | Forest Shuffle Automata: An implementation of an Automata to play the Forest Shuffle card game solo. | 
| 2025-04 | repository-template: A repository template for Python projects, using uv, ruff, pre-commit, and pytest. | 
| 2024-04 | clip-search: Text-to-image search with OpenCLIP, Docker, Flask, Faiss, etc. and a basic front-end. | 
| 2024-04 | hackerdaily2hackernews: A Google Chrome extension that redirects from HackerDaily comments to Hacker News comments. | 
| 2022-10 | lexisearch: Retrieval augmented generation on transcriptions from the Lex Fridman Podcast. | 
| 2022-02 | GloVedle: A version of Wordle using GloVe embeddings. | 
| 2022-02 | Wordle Terminal: A Python port of Wordle which is played in the terminal. | 
| 2022-01 | UUIDv4 Generator: Generate UUIDv4s in the browser, because I didn't like the way the top result on Google generated them. | 
| 2021-XX | Sentiment Analysis Tutorial: A tutorial on how to implement some common deep learning based sentiment analysis (text classification) models in PyTorch with torchtext, specifically the NBOW, GRU, bi-LSTM, CNN and Transformer models. Somehow got popular and has quite a few stars. | 
| 2021-XX | Sequence-to-Sequence Learning Tutorial: A tutorial implementing neural (deep learning based) sequence-to-sequence models in PyTorch with torchtext, by implementing six NMT papers. Also has quite a few stars and was used as a basis for the official PyTorch language translation tutorial. | 
| 2021-08 | easynlp: A library for running natural language processing inference (zero-shot classification, translation, named entity recognition, summarization, and question answering) on given data, using pre-trained models from transformers. | 
| 2021-02 | Image Classification Tutorial: A tutorial covering how to implement some deep learning computer vision models in PyTorch with torchvision. Covers: a basic multi-layer perceptron, LeNet, AlexNet, VGG and ResNet. | 
| 2021-01 | A Tour of Optimizers: A tutorial on common optimization algorithms used for neural networks, including: SGD, Adagrad, Adadelta, RMSprop and Adam. | 
| 2020-10 | numberworld: A reinforcement learning toy environment for task-oriented language grounding. | 
| 2020-07 | Glyphs of Dialogue: A small project combining ideas from GlyphNet and Dimensions of Dialogue. | 
| 2020-03 | Recurrent Attention Model: A PyTorch implementation of the Recurrent Attention Model from the Recurrent Models of Visual Attention paper. | 
| 2019-03 | countworld: Generating synthetic datasets that deal with counting. | 
| 2016 | snake: Snake in JavaScript. | 
| 2016 | difference: A timed mental maths game in JavaScript. | 