Interesting reads, and some thoughts, - week 2

Interesting reads, and some thoughts, - week 2 - 2024

On the back of Matt Webb’s post - interconnected.org - I’ve started moving my reading over to RSS. As a result, I’m getting far more coverage of interesting things, with what feels like far less effort than before.

Here are a bunch of interesting things from the last week.

Affiliation Bias in Peer Review of Abstracts by a Large Language Model (Paywall). JAMA Network. The authors used GPT3.5. They asked GPT to make an accept or reject decision on papers, and compared the results when the papers were presented with or without author affiliations. When affiliations were introduced, there was a mild positive bias towards authors from top tier institutes, but only at around the percentage-point level, and the bias is way smaller than human bias given the same data. The result is that GPT3.5 is a lot less biased than humans.

How do you make RAG more robust, and stop hallucinations? This paper gets accuracy rates up to about 97% by firstly having a fact base under the RAG, and secondly having a step in the prompt that asks the service to double check its facts. Conclusion - it’s possible to engineer systems that eliminate hallucination, in some circumstances. Simon Willison
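To make those two ingredients a bit more concrete, here is a minimal sketch in Python. It is not the paper's system: the fact base, the product in it, and the function names `retrieve` and `build_prompt` are all made up for illustration, the retrieval is a crude word-overlap match, and the call to an actual language model is left out entirely.

```python
import re

# Hypothetical fact base about an invented product, for illustration only.
FACT_BASE = [
    "The warranty period for the X100 kettle is 24 months.",
    "The X100 kettle holds 1.7 litres of water.",
    "Returns are accepted within 30 days of purchase.",
]

def tokens(text):
    """Lower-case word set, used for a crude overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, fact_base, top_k=2):
    """Return up to top_k facts that share the most words with the question."""
    q = tokens(question)
    ranked = sorted(fact_base, key=lambda fact: len(q & tokens(fact)), reverse=True)
    return [fact for fact in ranked[:top_k] if q & tokens(fact)]

def build_prompt(question, facts):
    """Ground the answer in the retrieved facts and add a double-check step."""
    fact_lines = "\n".join(f"- {fact}" for fact in facts)
    return (
        "Answer using ONLY the facts listed below.\n"
        f"Facts:\n{fact_lines}\n\n"
        f"Question: {question}\n\n"
        "Before you reply, double check every claim in your answer against "
        "the facts. If the facts do not support an answer, say you do not know."
    )

question = "How long is the kettle warranty?"
facts = retrieve(question, FACT_BASE)
prompt = build_prompt(question, facts)
print(prompt)
```

The resulting prompt would then be sent to whatever model sits behind the service. A real system would swap the word-overlap match for proper retrieval, but the shape is the same: facts first, then an explicit instruction to check the answer against them.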

The following is a great post. Scholarly Kitchen. Go read it. Here are my favourite quotes from the post.

“Unfortunately, the narrator does not confront the problem that the contents of the Library have absolutely no relationship to any notion of truth.”

“But I believe that as sources of information, large language models are more like the Library than they are other analogies we may reach for. They are not, for example, like calculators. I trust a hand calculator to give me consistently correct responses. And even for those calculations that are not fully correct – such as when a repeating decimal is rounded off – I trust the combination of the calculator’s result alongside my own knowledge of its heuristics to get me to the answer that suits my need. In other words, when a calculator is incorrect, it is incorrect in consistent and predictable ways. By contrast, when an LLM generates incorrect answers, it may do so much less predictably.”

“Nor is an LLM like a search engine”

“But if individuals are growing more comfortable turning to generative AI as a source of information, we should be careful not to fall into the same trap as the librarians of Babel, who mistakenly believe that a true catalogue generated by accident is more valuable than an incomplete one created based on knowledge from the librarians themselves.”

“The collectors of these datasets are groping towards the infinite, or at least a complete corpus of all human writing. When considering generative AI as a sociocultural phenomenon, whether they get there is beside the point. The text going into LLMs already feels as infinite to us as the books of the Library of Babel feel to the librarians.”

At the heart of what is going on is that we are creating engines that let us compose together any ideas, or any requests of them, which in many cases allows them, Genie-like, to perform minor miracles for us. But nothing they do is connected directly to reality. So we are creating engines of imaginative possibility and plausibility, and perhaps engines that let us explore previously hard-to-access spaces. We still need to take those dreams and birth them into the world, and perhaps use them to reshape our view of the world, while remaining the worldly agents who orchestrate all of this.

The next two links are via the great newsletter Journalology:

Large Language Publishing This piece is fantastic, though I very much believe it is deeply flawed.

IOP donates to r4l IOP donates revenues from retracted papers to charity. This is a great move, but it does not set a precedent, and I think we do not yet have an agreed view of what the ethical way to act is in scenarios like this one.

Cactus Communications and Taylor & Francis Pioneer Collaborative AI Solution I like this because it shows that these solutions are starting to be productised within the publishing industry.

The New England Journal of Medicine AI The New England Journal of Medicine AI journal supports and encourages the use of LLMs. They had better!

Department of Uh-Oh economic research edition Marginal Revolution points to this paper, “Selective and (mis)leading economics journals: Meta-research evidence” (Askarov/Zohid). Basically, tons of economics papers have dodgy statistics.

Text embeddings reveal almost as much as text This is very important. Much of the current technology is built on top of vector databases. Simon points to a paper that shows that, given the vector database, you can get back to the original content.
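As a toy illustration of why a stored vector is not a one-way hash, here is a small Python sketch. It is emphatically not the method from the paper, which works against real neural embedding models. Here the "embedding" is an assumption of mine: each word gets a pseudo-random +1/-1 vector derived from SHA-512, and a text is the sum of its word vectors. Given only the stored vector, the embedder, and a candidate vocabulary, a greedy search gets the words back.

```python
import hashlib

DIM = 512  # SHA-512 yields exactly 512 bits, one per dimension

def word_vector(word):
    """Deterministic pseudo-random +1/-1 vector (toy stand-in for a learned embedding)."""
    digest = hashlib.sha512(word.encode("utf-8")).digest()
    return [1 if (digest[i // 8] >> (i % 8)) & 1 else -1 for i in range(DIM)]

def embed(words):
    """Toy text embedding: the sum of the word vectors."""
    total = [0] * DIM
    for word in words:
        for i, x in enumerate(word_vector(word)):
            total[i] += x
    return total

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def invert(target, vocabulary):
    """Greedily recover the words behind a vector, using only the vector and the embedder."""
    residual = list(target)
    recovered = []
    while True:
        best = max(vocabulary, key=lambda w: dot(residual, word_vector(w)))
        # A word that is really present scores near DIM; an absent one scores near 0.
        if dot(residual, word_vector(best)) <= DIM / 2:
            break
        recovered.append(best)
        residual = [r - v for r, v in zip(residual, word_vector(best))]
    return recovered

# Hypothetical vocabulary and "secret" text, made up for this example.
vocabulary = ["embeddings", "reveal", "original", "text", "vector", "database",
              "private", "public", "medical", "record", "weather", "report"]
secret = ["private", "medical", "record"]

stored_vector = embed(secret)            # what would sit in the vector database
recovered = invert(stored_vector, vocabulary)
print(sorted(recovered))
```

Real embeddings are far less transparent than this, which is what makes the paper's result notable, but the practical takeaway is the same: treat the vectors with the same care as the text they came from.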

Withings Beam-o This is a small, usable, multi-sensor for health from Withings. I’ve been using a Withings smart weighing scales for a number of years, so my experience is that they can deliver on products like this. What is interesting to me is that if you couple devices like this with the delivery of telemedicine, all of a sudden the potential quality of telemedicine jumps up a good few notches - if the doctor can trust the device, and the data coming from the device.

Google research paper Google research just dropped a paper on training an LLM to act as a conversational agent that can diagnose medical conditions. In very tightly controlled conditions with medical actors, and doctors, it did better than medical professionals. They set up some clever methods of double reinforcement learning.

Scammy AI-generated books flooding Amazon Along with AI generated images of politicians, and deep fakes, this is another vector for flooding out actual authors and books.

Here is some more fun stuff:

ABC Paper - this paper, the paper itself, is a compiler for the C89 language.
Drawing Garden - draw a garden in your browser.
PakuPaku - one dimensional Pac-Man.
AI Robot Bests Marble Maze Game - robots beat marble maze
Blindboy Boatclub interview