Ian Mulvany

December 21, 2023

Interesting LLM Papers - December 2023

Some interesting LLM papers

This is my take on papers from Davis Blalock (https://dblalock.substack.com).

This paper uses an LLM to rank responses during the fine-tuning of a small model (7B parameters), to get it to perform in a more aligned way than a large model (70B) that has been fine-tuned using RLHF. This is important because LLM-based optimisation is easier to run and cheaper than RLHF, and being able to get small models to perform better than large models in specific contexts is also more broadly useful. https://arxiv.org/abs/2310.16944
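To make the idea a bit more concrete, here is a minimal sketch of the two pieces as I understand them: turning LLM-assigned scores into a preference pair, and the direct preference optimisation (DPO) style loss that the Zephyr paper builds on. The function names, the toy scores, the choice of the lowest-scored response as the rejected one, and the `beta` value are all my own assumptions for illustration. This is not the authors' code, and a real run would compute the log-probabilities from actual models.

```python
import math


def build_preference_pair(prompt, scored_responses):
    """Turn LLM-scored responses into one (chosen, rejected) pair.

    scored_responses is a list of (response_text, score) tuples, where the
    score is assumed to come from a judge LLM. Taking the lowest-scored
    response as 'rejected' is a simplification made for this sketch.
    """
    ranked = sorted(scored_responses, key=lambda item: item[1], reverse=True)
    return {"prompt": prompt, "chosen": ranked[0][0], "rejected": ranked[-1][0]}


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO-style loss for a single preference pair.

    Inputs are sequence log-probabilities under the model being trained
    (policy) and under a frozen reference model. The loss is
    -log(sigmoid(margin)), written as softplus(-margin) for stability.
    """
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Toy usage with made-up scores and log-probabilities.
pair = build_preference_pair(
    "Explain RLHF in one sentence.",
    [("answer A", 3.0), ("answer B", 9.0), ("answer C", 1.0)],
)
loss_before = dpo_loss(0.0, 0.0, 0.0, 0.0)       # policy identical to reference
loss_after = dpo_loss(-1.0, -5.0, -2.0, -2.0)    # policy now favours 'chosen'
```

The point of the sketch is that no separate reward model or reinforcement learning loop appears anywhere: the ranking step produces pairs, and the loss pushes the small model towards the preferred response relative to a reference model.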

There are tons of other papers in his writeup (https://dblalock.substack.com/p/2023-11-26-arxiv-roundup-big-potential) that show potentially significant performance and efficiency gains to be made.

Main takeaway - we remain far from the point of exhausting all optimisation paths for LLMs.

References: Tunstall, L., Beeching, E., Lambert, N., Rajani, N., Rasul, K., Belkada, Y., Huang, S., von Werra, L., Fourrier, C., Habib, N., Sarrazin, N., Sanseviero, O., Rush, A. & Wolf, T. (2023). Zephyr: Direct Distillation of LM Alignment. arXiv:2310.16944

Classifications from OpenAI: artificial intelligence, machine learning, optimization techniques, model tuning, academic papers, computer science 004.

About Ian Mulvany

Hi, I'm Ian - I work on academic publishing systems. You can find out more about me at mulvany.net. I'm always interested in engaging with folk on these topics. If you have made your way here, don't hesitate to reach out if there is anything you want to share, discuss, or ask for help with!