The commercial, closed-source models have advanced significantly beyond current open-source alternatives. Beyond the sheer size of the model parameters, companies like Anthropic and OpenAI have effectively mastered the process of 'scaffolding' these models to specialise in particular tasks. Where models like DeepSeek, Llama and Mistral are failing (as consumer products) is that you cannot get features like "Deep Research", "Custom GPTs" or "Web Search" baked into the overall experience for an average consumer.
It is entirely possible that some of the companies developing these open-source foundational models have no aspirations of becoming a consumer product, and that's absolutely fine. But it paints a chilling picture: the lay consumer will inevitably shift towards these all-in-one products (ChatGPT and Claude) powered by closed, commercial models.
Here's what's going to happen. These companies will soon realise that the "useful" internet has already been trained on, that synthetic data lacks a certain degree of variety, and that proprietary data has become unavailable for use outside of contracts due to strict regulations.
And when that happens, they will come for your thoughts. They will start training on your brainstorming sessions, those ideas that you feel might change the world some day, that breakthrough algorithm that you wanted ChatGPT to incorporate into a web app.
Somewhere in the fine print of their privacy policies, a subtle entry will be made allowing them to legally train on your data, and let's face it: neither you nor I will have the legal strength to fight these companies.
As of today, you might feel that interacting with your ChatGPT voice assistant lets you share intimate details that would be too embarrassing to share with another human. But these details go into a profile maintained by these companies, accessible by far more folks than your best friend. And one day, when the time is right, this profile will be used to nudge you towards certain services and products, camouflaged as the helpful suggestion of your trusted (and quite possibly ONLY) friend, ChatGPT.
Personally, I have started using Open WebUI and locally hosted models via Ollama for the vast majority of my brain-dump needs, and if you're a tech-savvy person, I think you should do the same. Research over the web still seems very tough without a tool like Perplexity.
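For the tech-savvy, the appeal of this setup is that nothing leaves your machine. As a rough sketch (assuming a default Ollama install, which serves an HTTP API on localhost:11434, and a model such as llama3 that you've already pulled), a private brain-dump query can be as simple as:

```python
import json
import urllib.request

# Default address of a local Ollama server; nothing here talks to the cloud.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_local_model(model: str, prompt: str) -> str:
    """Send the prompt to the local Ollama server and return its reply."""
    payload = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        # Non-streaming responses carry the full answer in the "response" field.
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    try:
        print(ask_local_model("llama3", "Summarise my notes on graph colouring."))
    except OSError:
        print("Ollama doesn't appear to be running on localhost:11434.")
```

Open WebUI simply wraps this same local API in a chat interface, so your prompts and their answers stay on your own hardware.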
I've started devoting some time to figuring out ways to let an average person use powerful open-source AI privately, without overwhelming their locally available infra (which in most cases is a laptop or phone). I have some approaches in mind; I'll probably cover them in a future post.
The future of AI needs to be secure and private. Your activities are already being tracked across the web, but your thoughts need to be free from analysis or subjugation.