ChatGPT – Take it to the EDGE
POSTED 11/11/2024
Will open-source AI outpace OpenAI (ChatGPT) and Google (Bard)? That's what a leaked memo from a Google employee argues, after watching the rapid pace of development by the open-source community on Meta's leaked LLaMA foundation model.
It's frightening to Google and OpenAI just how quickly the open-source community was able to catch up with the performance of ChatGPT, and at a much lower cost: the memo notes that training Vicuna-13B cost around $300.
What's interesting here is how the evolution of conversational AI with Large Language Models (LLMs) such as these can quickly approach the performance of their highly funded partners and competitors, but with far less compute. This is because the approach is stackable: the model does not need to be trained from scratch every time a new release is ready. Instead, each release builds on the previous version, fine-tuning it to improve the outcome, as in the sketch below.
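To make the "stackable" idea concrete, here is a minimal sketch of one low-cost fine-tuning technique the community has used, low-rank adaptation (LoRA), which attaches small trainable adapter weights to an existing foundation model instead of retraining it. It assumes the Hugging Face transformers and peft libraries; the checkpoint name and hyperparameters are illustrative placeholders, not the recipe of Vicuna or any particular project.

```python
# Sketch: build on an existing foundation model by training only small
# LoRA adapter matrices, rather than pretraining from scratch.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "huggyllama/llama-7b"  # hypothetical base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# LoRA trains a tiny fraction of the parameters, which is why community
# fine-tunes can cost hundreds of dollars instead of millions.
config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor for the updates
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of the weights
```

Because only the small adapter matrices are trained, a fine-tune like this fits on modest hardware and a modest budget, which is exactly the dynamic the memo found alarming.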
For this first wave of models, that foundation was Meta's LLaMA, along with models from EleutherAI, a non-profit research group. Now that Google and OpenAI see the threat from startups and the open-source community, they may start to curb access, which would concentrate the technology's potential in a few mega companies. That would slow innovation and the pace of development, but now that the cat is out of the bag, I doubt it will change the outcome; it will probably just focus the community on the non-profit platforms. These will lag initially due to the high cost of creating foundation models, but with AI investors, startups, and the open-source community's resources and expertise focused on them, new approaches will evolve. This will reduce the reliance on any one model and will probably lead to foundation models with domain expertise in a particular approach or data set, with companies choosing the most appropriate model for each application.
One area where we see the value of this lighter model approach evolving is Edge AI. By running much smaller open-source models directly on an edge device's CPU, within a Web 3.0 architecture, many new use cases will start to appear that are more efficient, faster, and more cost-effective to run (see the sketch below).
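As a sketch of what CPU-only edge inference can look like, the snippet below loads a small quantized model and answers a query entirely on-device, assuming the llama-cpp-python bindings and a local GGUF checkpoint; the file path and settings are placeholders.

```python
# Sketch: run a small quantized open-source model on an edge device's CPU,
# with no cloud round trip. The model file path is a hypothetical example.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/small-llama-q4.gguf",  # 4-bit quantized checkpoint
    n_ctx=2048,    # context window
    n_threads=4,   # use the edge device's CPU cores
)

result = llm("Q: What is edge AI? A:", max_tokens=64, stop=["Q:"])
print(result["choices"][0]["text"])
```

Quantizing the weights to 4 bits is what makes a model like this small enough to fit in an edge device's memory while keeping CPU inference responsive.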
Edge AI processing will become increasingly important as the demand for AI increases. Qualcomm's white paper, The Future of AI is Hybrid, shows how mobile devices can run many different generative AI models and model sizes. As we move from AI training, which requires massive amounts of compute but may happen only once or twice a year, to AI inference, which runs the model for every active user query, it is clear that the inference model needs to run at the edge to keep the cost of serving such a high number of daily active users viable. Running these inference models in the cloud results in very high costs that are simply not scalable.
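A rough back-of-envelope calculation shows why. All of the numbers below are illustrative assumptions (they are not taken from the Qualcomm white paper), but they capture how a small per-query cloud cost multiplies across a large user base.

```python
# Sketch: why cloud-only inference cost scales badly with daily active users.
# Every figure here is a hypothetical assumption for illustration only.
daily_active_users = 100_000_000
queries_per_user_per_day = 10
cloud_cost_per_query = 0.002  # assumed $/query for GPU-served inference

daily_cloud_cost = daily_active_users * queries_per_user_per_day * cloud_cost_per_query
print(f"Cloud-only inference: ${daily_cloud_cost:,.0f}/day")  # $2,000,000/day

# If a large share of queries is served on-device at near-zero marginal cost,
# the operator's cloud bill shrinks proportionally.
edge_fraction = 0.8
hybrid_cost = daily_cloud_cost * (1 - edge_fraction)
print(f"Hybrid (80% on-device): ${hybrid_cost:,.0f}/day")     # $400,000/day
```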
Be sure to check out our first ChatGPT blog post here!