Some interesting LLM papers
This paper uses an LLM to rank candidate responses and then fine-tunes a small model (7B parameters) on the resulting preference pairs, getting it to behave in a more aligned way than a large model (70B) fine-tuned with RLHF. This is important because LLM-based preference optimisation is easier to run and cheaper than RLHF, and being able to get small models to outperform large models in specific contexts is also more broadly useful. https://arxiv.org/abs/2310.16944
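As a rough illustration of the mechanism, here is a minimal sketch of the DPO objective that the paper's distilled-DPO step optimises over LLM-ranked (chosen, rejected) response pairs. The function and variable names are my own, and a real implementation (for example Hugging Face TRL's DPO trainer) additionally handles tokenisation, batching, and the reference-model forward pass.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss over preference pairs.

    Each argument is the summed log-probability of a full response
    under either the policy being trained or a frozen reference model.
    The 'chosen' response is the one the ranking LLM preferred.
    """
    chosen_margin = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_margin = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the policy to widen the gap between preferred and dispreferred responses.
    return -F.logsigmoid(chosen_margin - rejected_margin).mean()

# Toy call with made-up per-response log-probs for a batch of two pairs.
loss = dpo_loss(torch.tensor([-12.0, -9.5]), torch.tensor([-14.0, -11.0]),
                torch.tensor([-12.5, -9.8]), torch.tensor([-13.5, -10.5]))
```

The appeal over RLHF is visible in the sketch: there is no separate reward model to train and no RL loop, just a supervised-style loss over ranked pairs.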
There are tons of other papers in his write-up (https://dblalock.substack.com/p/2023-11-26-arxiv-roundup-big-potential) that show potentially significant performance and efficiency gains to be made.
Main takeaway - we are still far from exhausting the available optimisation paths for LLMs.
References: Tunstall, L., Beeching, E., Lambert, N., Rajani, N., Rasul, K., Belkada, Y., Huang, S., von Werra, L., Fourrier, C., Habib, N., Sarrazin, N., Sanseviero, O., Rush, A. M. & Wolf, T. (2023). Zephyr: Direct Distillation of LM Alignment. arXiv:2310.16944