Large language models (LLMs) are typically optimized to answer people's questions. But there is a trend toward models also being optimized to fit into agentic workflows. This will give a huge boost to agentic performance!

Following ChatGPT's breakaway success at answering questions, a lot of LLM development focused on providing a good consumer experience. LLMs were tuned to answer questions ("Why did Shakespeare write Macbeth?") or follow human-provided instructions ("Explain why Shakespeare wrote Macbeth"). A large fraction of instruction-tuning datasets guide models to provide more helpful responses to the sort of human-written questions and instructions one might ask a consumer-facing LLM, like those offered by the web interfaces of ChatGPT, Claude, or Gemini.

But agentic workloads call for different behaviors. Rather than directly generating responses for consumers, AI software may use a model as part of an iterative workflow to reflect on its own output, use tools, write plans, and collaborate in a multi-agent setting. Major model makers are increasingly optimizing models to be used in AI agents as well.

Take tool use (or function calling). If an LLM is asked about the current weather, it won't be able to derive that information from its training data. Instead, it might generate a request for an API call to get it. Even before GPT-4 natively supported function calls, application developers were already using LLMs to generate function calls, but by writing more complex prompts (such as variations of ReAct prompts) that tell the LLM what functions are available, then having a separate software routine parse the LLM's output (perhaps with regular expressions) to figure out whether it wants to call a function. Generating such calls became much more reliable after GPT-4, and then many other models, natively supported function calling.
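The pre-native-support pattern described above can be sketched in a few lines. This is an illustrative assumption, not any specific library's format: the `CALL name {json}` output convention and the `get_weather` tool are invented here to show how a regex-based parser separates tool requests from plain answers.

```python
import json
import re

# Hypothetical prompt convention: the model is instructed to answer in plain
# text, or to request a tool with a line like: CALL get_weather {"city": "Paris"}
CALL_PATTERN = re.compile(r'^CALL\s+(\w+)\s+(\{.*\})\s*$', re.MULTILINE)

def parse_tool_call(llm_output):
    """Return (function_name, arguments) if the model requested a tool, else None."""
    match = CALL_PATTERN.search(llm_output)
    if match is None:
        return None
    name, raw_args = match.groups()
    return name, json.loads(raw_args)

# Example model output that requests a tool call:
reply = 'I need current data.\nCALL get_weather {"city": "Paris"}'
print(parse_tool_call(reply))          # ('get_weather', {'city': 'Paris'})
print(parse_tool_call("It is sunny."))  # None
```

Native function calling replaces this fragile string convention with structured output validated by the model provider, which is why reliability improved so much.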
Today, LLMs can decide to call functions to search for information for retrieval-augmented generation (RAG), execute code, send emails, place orders online, and much more. Recently, Anthropic released a version of its model that is capable of computer use, operating a computer (usually a virtual machine) via mouse clicks and keystrokes. I've enjoyed playing with the demo. While other teams have been prompting LLMs to use computers to build a new generation of RPA (robotic process automation) applications, native support for computer use by a major LLM provider is a great step forward. This will help many developers! [Reached length limit; full text: https://lnkd.in/gHmiM3Tx ]
Trends in Large Language Models
Summary
Large language models (LLMs) are evolving to meet new demands, transitioning from consumer-focused applications to specialized, integrated systems that enhance workflows. Current trends include the development of more modular AI systems, improved contextual understanding, and the rise of models that utilize real-time data and broader modalities.
- Explore agentic AI applications: Look into how LLMs are being designed for iterative workflows, such as tool usage and collaborative problem-solving, rather than just answering questions.
- Adopt modular AI systems: Consider using compound AI systems that combine multiple models and tools to address specific business challenges efficiently and scalably.
- Prepare for evolving training data needs: Stay ahead of the potential data scarcity by exploring synthetic data, domain-specific datasets, and innovative training strategies to maintain and improve AI performance.
-
For the last couple of years, Large Language Models (LLMs) have dominated AI, driving advancements in text generation, search, and automation. But 2025 marks a shift—one that moves beyond token-based predictions to a deeper, more structured understanding of language. Meta's Large Concept Models (LCMs), launched in December 2024, redefine AI's ability to reason, generate, and interact by focusing on concepts rather than individual words. Unlike LLMs, which rely on token-by-token generation, LCMs operate at a higher abstraction level, processing entire sentences and ideas as unified concepts. This shift enables AI to grasp deeper meaning, maintain coherence over longer contexts, and produce more structured outputs. Attached is a fantastic graphic created by Manthan Patel.

How LCMs Work:
🔹 Conceptual Processing – Instead of breaking sentences into discrete words, LCMs encode entire ideas, allowing for higher-level reasoning and contextual depth.
🔹 SONAR Embeddings – A breakthrough in representation learning, SONAR embeddings capture the essence of a sentence rather than just its words, making AI more context-aware and language-agnostic.
🔹 Diffusion Techniques – Borrowing from the success of generative diffusion models, LCMs stabilize text generation, reducing hallucinations and improving reliability.
🔹 Quantization Methods – By refining how AI processes variations in input, LCMs improve robustness and minimize errors from small perturbations in phrasing.
🔹 Multimodal Integration – Unlike traditional LLMs that primarily process text, LCMs seamlessly integrate text, speech, and other data types, enabling more intuitive, cross-lingual AI interactions.

Why LCMs Are a Paradigm Shift:
✔️ Deeper Understanding: LCMs go beyond word prediction to grasp the underlying intent and meaning behind a sentence.
✔️ More Structured Outputs: Instead of just generating fluent text, LCMs organize thoughts logically, making them more useful for technical documentation, legal analysis, and complex reports.
✔️ Improved Reasoning & Coherence: LLMs often lose track of long-range dependencies in text. LCMs, by processing entire ideas, maintain context better across long conversations and documents.
✔️ Cross-Domain Applications: From research and enterprise AI to multilingual customer interactions, LCMs unlock new possibilities where traditional LLMs struggle.

LCMs vs. LLMs: The Key Differences
🔹 LLMs predict text at the token level, often leading to word-by-word optimizations rather than holistic comprehension.
🔹 LCMs process entire concepts, allowing for abstract reasoning and structured thought representation.
🔹 LLMs may struggle with context loss in long texts, while LCMs excel at maintaining coherence across extended interactions.
🔹 LCMs are more resistant to adversarial input variations, making them more reliable in critical applications like legal tech, enterprise AI, and scientific research.
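The token-level vs. concept-level distinction can be illustrated with a toy sketch. This is not Meta's implementation: the hash-based "embedding" below is a stand-in for a real SONAR encoder, included only to show how the unit of prediction changes from one token to one sentence-level vector.

```python
# Toy illustration of token-level vs. concept-level processing.
# Real LCMs map each sentence to a learned SONAR embedding vector; here each
# sentence maps to a single integer as a stand-in, to contrast the unit counts.

def tokenize(text):
    """An LLM's unit of prediction: individual tokens (crudely, words here)."""
    return text.split()

def encode_concepts(text):
    """An LCM's unit of prediction: one fixed-size vector per sentence."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return [(s, hash(s) % 10_000) for s in sentences]

doc = "LLMs predict tokens. LCMs predict sentence embeddings."
print(len(tokenize(doc)))         # 7 prediction units for a token model
print(len(encode_concepts(doc)))  # 2 prediction units for a concept model
```

Operating on two sentence vectors instead of seven tokens is what lets a concept model reason over long documents with far fewer autoregressive steps.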
-
Why Compound AI Systems Are Taking Over ⭐

We're moving beyond single-model AI into an era where Compound AI Systems—modular, flexible, and powerful—are setting a new standard. But what does this mean? And why should AI leaders pay attention?

🔍 𝗪𝗵𝗮𝘁 𝗔𝗿𝗲 𝗖𝗼𝗺𝗽𝗼𝘂𝗻𝗱 𝗔𝗜 𝗦𝘆𝘀𝘁𝗲𝗺𝘀
Unlike traditional AI models that work in isolation, Compound AI Systems integrate multiple components—LLMs, retrieval mechanisms, external tools, and reasoning engines—to solve complex problems more effectively. Instead of relying on one massive model, these systems:
✔️ Combine multiple AI models for specialized tasks
✔️ Use retrieval mechanisms to fetch real-time, relevant data
✔️ Leverage external tools (APIs, databases, or symbolic solvers) to enhance reasoning
✔️ Improve adaptability by dynamically selecting the best approach for a given problem
This modular approach enhances accuracy, efficiency, and scalability—giving AI systems the ability to think beyond their training data and operate more intelligently in real-world environments.

🏆 𝗪𝗵𝗲𝗿𝗲 𝗖𝗼𝗺𝗽𝗼𝘂𝗻𝗱 𝗔𝗜 𝗜𝘀 𝗪𝗶𝗻𝗻𝗶𝗻𝗴
↳ Google's AlphaCode 2: Generates millions of programming solutions, then intelligently filters them down to the best ones—resulting in dramatic improvements in AI-driven code generation.
↳ AlphaGeometry: Combines a large language model (LLM) with a symbolic solver, enabling AI to solve complex geometry problems at an expert level.
↳ Retrieval-Augmented Generation (RAG): Now a standard in enterprise AI, RAG models retrieve relevant data in real time before generating responses, significantly boosting accuracy and contextual relevance.
↳ Multi-Agent Systems: Startups and research labs are developing AI "teams"—where multiple models communicate and collaborate to solve problems faster and more efficiently than a single model could.

💡 𝗪𝗵𝘆 𝗜𝗻𝗱𝘂𝘀𝘁𝗿𝘆 𝗟𝗲𝗮𝗱𝗲𝗿𝘀 𝗔𝗿𝗲 𝗕𝗲𝘁𝘁𝗶𝗻𝗴 𝗕𝗶𝗴 𝗼𝗻 𝗖𝗼𝗺𝗽𝗼𝘂𝗻𝗱 𝗔𝗜
This isn't just a research trend. It's an industry-wide shift.
↳ Microsoft, IBM, and Databricks are already pivoting their AI strategies toward modular, system-based AI architectures.
↳ Fireworks AI is leading the GenAI inference platform space with Compound AI Systems.
↳ Even OpenAI's CEO, Sam Altman, emphasized the transition: "We're going to move from talking about models to talking about systems."

𝗧𝗵𝗲 𝗕𝗶𝗴 𝗧𝗮𝗸𝗲𝗮𝘄𝗮𝘆 𝗳𝗼𝗿 𝗔𝗜 𝗟𝗲𝗮𝗱𝗲𝗿𝘀
The implications are massive:
✔️ AI performance will increasingly depend on system design—not just model size
✔️ Custom AI solutions will become the norm, allowing businesses to tailor AI systems for specific needs
✔️ Efficiency will skyrocket, as compound systems reduce computational waste by dynamically choosing the best approach for a given task

-----------------------
Share this with your network ♻️ Follow me (Aishwarya Srinivasan) for more AI insights, news, and educational resources to keep you up-to-date about the AI space!
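The control flow of a compound system can be sketched in a few lines. Everything here is an illustrative placeholder: the in-memory knowledge base, the `eval`-based calculator tool, the stubbed `llm` function, and the deliberately naive routing heuristic all stand in for real components.

```python
# Minimal sketch of a compound AI system: a controller routes each query to a
# retriever, a calculator tool, or a (stubbed) language model.

KNOWLEDGE_BASE = {"refund policy": "Refunds are issued within 14 days."}

def retrieve(query):
    """Placeholder retriever: substring match instead of a real vector index."""
    for key, passage in KNOWLEDGE_BASE.items():
        if key in query.lower():
            return passage
    return None

def calculator(expression):
    """External tool for arithmetic the LLM alone might get wrong.
    Only safe for pure arithmetic expressions; a real system would sandbox this."""
    return str(eval(expression, {"__builtins__": {}}))

def llm(prompt):
    """Stub standing in for a real model call."""
    return f"[model answer to: {prompt}]"

def compound_answer(query):
    # Naive routing: arithmetic -> tool, known topic -> RAG, otherwise plain LLM.
    if any(ch.isdigit() for ch in query) and any(op in query for op in "+-*/"):
        return calculator(query)
    passage = retrieve(query)
    if passage is not None:
        return llm(f"Answer using this context: {passage}\nQuestion: {query}")
    return llm(query)

print(compound_answer("17*3"))                       # 51
print(compound_answer("What is the refund policy?"))
```

Production systems replace each placeholder with a real component (vector search, a sandboxed interpreter, an actual model), but the shape of the controller is the same.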
-
As artificial intelligence continues its meteoric rise, we often hear about breakthroughs and new capabilities. But what if the next big challenge isn't just technical, but about something more fundamental — running out of data? A recent report highlights a looming bottleneck: by 2028, AI developers may exhaust the stock of public online text available for training large language models (LLMs). The rapid growth in model size and complexity is outpacing the slow expansion of usable Internet content, and tightening restrictions on data usage are only compounding the problem. What does this mean for the future of AI? Good piece in Nature outlining some of the key advances in the field.

1️⃣ Shift to Specialized Models: The era of "bigger is better" may give way to smaller, more focused models, tailored to specific tasks.
2️⃣ Synthetic Data: Companies like OpenAI are already leveraging AI-generated content to train AI — a fascinating, but potentially risky, feedback loop.
3️⃣ Exploring New Data Types: From sensory inputs to domain-specific datasets (like healthcare or environmental data), innovation in what counts as "data" is accelerating.
4️⃣ Rethinking Training Strategies: Re-reading existing data, enhancing reinforcement learning, and prioritizing efficiency over scale are paving the way for smarter models that think more deeply.

This challenge isn't just technical; it's ethical, legal, and creative. Lawsuits from content creators highlight the delicate balance between innovation and intellectual property rights. Meanwhile, researchers are pushing the boundaries of what's possible with less. Link to piece here: https://lnkd.in/gvRvxJZq
-
As a PhD student in Machine Learning Systems (MLSys), my research focuses on making LLM/GenAI serving and training more efficient. Over the past few months, I've come across some cool papers that keep shifting how I see this field. So, I put together a curated list to share with you all: https://lnkd.in/gYjBqVPt

This list has a mix of academic papers, tutorials, and projects on GenAI systems. Whether you're a researcher, a developer, or just curious about GenAI systems, I hope it's a useful starting point. The field moves fast, and having a go-to resource like this can cut through the noise.

So, what's trending in GenAI systems? One massive trend is efficiency. As models balloon in size, training and serving them eats up insane amounts of resources. There's a push toward smarter ways to schedule computations, overlap communication, compress models, manage memory, optimize kernels, etc. — stuff that makes GenAI practical beyond just the big labs.

Another exciting wave is the rise of systems built to support a variety of GenAI applications/tasks. This includes cool stuff like:
- Reinforcement Learning from Human Feedback (RLHF): Fine-tuning models to align better with what humans want.
- Multi-modal systems: Handling text, images, audio, and more.
- Chat services and AI agent systems: From real-time conversations to automating complex tasks, these are stretching what LLMs can do.
- Edge LLMs: Bringing these models to devices with limited and heterogeneous resources, like your phone or IoT gadgets, which could change how we use AI day-to-day.

The list isn't exhaustive, so if you've got papers or resources you think belong here, drop them in the comments.
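One of the compression techniques behind the efficiency trend, weight quantization, can be sketched in its simplest symmetric int8 form. This is a toy illustration under stated assumptions, not any particular serving stack's implementation: store 8-bit integers plus one float scale per tensor, trading a small reconstruction error for roughly 4x less memory than fp32.

```python
# Toy sketch of symmetric int8 weight quantization: map each float weight to
# an integer in [-127, 127] using a single per-tensor scale factor.

def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid zero scale
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [qi * scale for qi in q]

w = [0.12, -0.5, 0.33, 0.01]
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
err = max(abs(a - b) for a, b in zip(w, w_hat))
print(q)    # integers in [-127, 127], stored in 1 byte each
print(err)  # reconstruction error, bounded by scale / 2
```

Real systems refine this with per-channel scales, activation-aware calibration, and fused int8 kernels, but the memory-for-precision trade is the same.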
-
Folks interested in AI / AI PM, I recommend watching this recent session by the awesome Aishwarya Naresh Reganti talking about Gen AI trends. ANR is a "Top Voice" whom I follow regularly; I leverage her awesome GitHub repository, consume her Instagram shorts like candy, and am looking forward to her upcoming Maven course on AI Engineering. https://lnkd.in/g4DiZXBU

Aishwarya highlights the growing importance of prompt engineering, particularly goal engineering, where AI agents break down complex tasks into smaller steps and self-prompt to achieve higher-order goals. This trend reduces the need for users to have extensive prompt engineering skills. In the model layer, she discusses the rise of small language models (SLMs) that achieve impressive performance with less computational power, often through knowledge distillation from larger models. Multimodal foundation models are also gaining traction, with research focusing on integrating text, images, videos, and audio seamlessly. Aishwarya emphasizes Retrieval Augmented Generation (RAG) as a successful application of LLMs in the enterprise. She notes ongoing research to improve RAG's efficiency and accuracy, including better retrieval methods and noise handling. AI agents are discussed in detail, with a focus on their potential and current limitations in real-world deployments. Finally, Aishwarya provides advice for staying updated on AI research, recommending reliable sources like Hugging Face and prioritizing papers relevant to one's specific interests. She also touches upon the evolving concept of "trust scores" for AI models and the importance of actionable evaluation metrics.

Key Takeaways:
Goal Engineering: AI agents are learning to break down complex tasks into smaller steps, reducing the need for users to have extensive prompt engineering skills.
Small Language Models (SLMs): SLMs are achieving impressive performance with less computational power, often by learning from larger models.
Multimodal Foundation Models: These models are integrating text, images, videos, and audio seamlessly.
Retrieval Augmented Generation (RAG): RAG is a key application of LLMs in the enterprise, with ongoing research to improve its efficiency and accuracy.
AI Agents: AI agents have great potential but face limitations in real-world deployments due to challenges like novelty and evolution.
Staying Updated: Focus on reliable sources like Hugging Face and prioritize papers relevant to your interests.
🤔 Trust Scores: The concept of "trust scores" for AI models is evolving, emphasizing the importance of actionable evaluation metrics.
📏 Context Length: Models can now handle much larger amounts of input text, enabling more complex tasks.
💰 Cost: The cost of using AI models is decreasing, making fine-tuning more accessible.
📚 Modularity: The trend is moving toward using multiple smaller AI models working together instead of one large model.
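The knowledge-distillation idea behind SLMs can be sketched as training the student to match the teacher's softened output distribution. The logits below are toy values chosen for illustration; a real training run would minimize this loss over a dataset with backpropagation.

```python
import math

# Toy sketch of knowledge distillation: the student model is rewarded for
# matching the teacher's (temperature-softened) probabilities, not hard labels.

def softmax(logits, temperature=1.0):
    exps = [math.exp(x / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions."""
    p = softmax(teacher_logits, temperature)  # teacher targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [3.0, 1.0, 0.2]
aligned = [2.9, 1.1, 0.3]     # student close to the teacher
misaligned = [0.2, 1.0, 3.0]  # student far from the teacher
print(distillation_loss(teacher, aligned) < distillation_loss(teacher, misaligned))  # True
```

The temperature softens both distributions so the student also learns the teacher's relative preferences among wrong answers, which is much of what transfers.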
Generative AI in 2024 w/ Aishwarya
-
O'Reilly's Technology Trends for 2025 report, published today, is based on data from 2.8 million users of its learning platform, giving insights into the most popular technology topics consumed and identifying emerging trends that could influence business decisions in the year ahead.

The outlook for AI technologies is marked by dramatic growth in key areas. The percentages describe the growth in interest or usage of specific areas within the field: Prompt Engineering surged by 456%, AI Principles by 386%, and Generative AI by 289%. Additionally, the use of GitHub Copilot skyrocketed by 471%, highlighting a robust interest in tools that boost productivity. In terms of security, there was a significant 44% increase in interest in governance, risk, and compliance, accompanied by heightened attention to application security and the zero trust model. While traditional programming languages such as Python and Java experienced declines, data engineering skills witnessed a 29% increase, underscoring their essential role in powering AI applications.

* * *

Based on these numbers, the report analyzes the Technology Trends for 2025 in the field of AI:

I. Diverse AI Models: Unlike previous years when ChatGPT dominated, the field now includes a variety of strong contenders like Claude, Google's Gemini, and Llama. These models have broadened the AI landscape and are each finding their niches within different user bases.

II. Skill Growth: There has been a significant increase in interest and development in AI skills, notably in Machine Learning, Artificial Intelligence, Natural Language Processing, Generative AI, AI Principles, and Prompt Engineering. These skills are seeing varying levels of growth, with Prompt Engineering experiencing the most substantial surge.

III. Shift in Platform Focus: Interest in GPT has declined as the industry moves away from platform-specific knowledge toward more generalized, foundational AI understanding. This shift reflects a maturation of the industry as developers seek capabilities that are applicable across various models.

IV. Future Trends: The report anticipates potential disillusionment with AI, a phenomenon more sociological than technical, often due to overhyped expectations. Nonetheless, advancements continue, particularly in making AI interactions more intuitive and reducing the need for complex prompts.

V. Development Tools and Data Engineering: Tools like LangChain and retrieval-augmented generation (RAG) are highlighted as key to building more sophisticated AI applications that can handle private data more securely and efficiently. Moreover, the importance of data engineering skills is underscored, supporting AI applications with robust data infrastructure.

* * *

The insights of the report can guide strategic planning, investment decisions, and curriculum development, and overall offer a valuable snapshot of the technology landscape.
-
Happy New Year! If you are an Enterprise CTO, you are probably thinking about your GenAI strategy. Here's a decent write-up by Gartner: https://gtnr.it/3RQodsK. To augment that, here are some #genai trends to track and act on in 2024:

-- Open and Smaller Models: Open models like Llama, Mistral, BERT, and FLAN are becoming competitive with larger, closed-source models. They're suitable for many use cases and offer transparency for Responsible AI. In my opinion, open & closed models are not in a zero-sum game; BOTH should be used for the right use case. Action: Implement a clear plan for using different models. Amazon Web Services (AWS) users can leverage Bedrock & SageMaker (https://bit.ly/3vkX4qa).

-- Domain-Adapted Models: Use your enterprise's proprietary data to extend a large language model via continued pre-training (CPT) for domain-specific tasks. Action: Assess your use cases and data for CPT alongside fine-tuning. Learn more: https://lnkd.in/eNjbQm-m

-- Multi-modal Models (MMMs): MMMs will gain prominence in 2024. Both commercial (like GPT-4V) and open-source models (like LLaVA) will be popular. Action: Expand into business cases served by MMMs. More about LLaVA: https://lnkd.in/eTvn82iM

-- AI Agents (RAG+++): AI agents using LLMs can improve upon RAG by intelligently utilizing multiple data sources. Action: Prepare APIs and data sources for AI agents. More information: https://lnkd.in/eVf_J_bA

-- LLMs with Graphs: Graphs are one of the best representations of real-world knowledge, which, when combined with LLMs, can be very effective in various domains. Action: Identify suitable business cases and explore Graphs+LLMs. Details: https://lnkd.in/eF68FbVA

-- AI Routers: Most enterprises will end up using a dozen or more models, and it will become necessary to manage multiple models - auth, audit, and smart model selection. Action: Build an AI router. AWS Bedrock can assist, but more is needed. Info: https://go.aws/4aFLkyA
-- FinOps meets MLOps: Focus on cost optimization for GenAI projects. 2023 was all about GenAI POCs; 2024 will be about production & big bills! Action: Learn about GenAI business cases and FinOps for GenAI: https://lnkd.in/ezfV8NTa

-- Make AI Invisible: Technology is at its best when it's invisible and seamless to the end user. Action: Look at existing enterprise applications and look for ways to rethink the user experience using GenAI while keeping the tech invisible. (https://bit.ly/3vkXvAO)

What are you tracking? Watch out for more on domain-focused AI trends in areas like AI for the edge, robotics, and drug discovery in upcoming posts.
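The auth/audit/selection loop an AI router might implement can be sketched in a few lines. The model names, costs, capability sets, and the "cheapest capable model" heuristic are all illustrative assumptions, not any vendor's API.

```python
import time

# Hedged sketch of an AI router: authorize the caller, pick the cheapest model
# that can handle the task, and keep an audit trail of every routing decision.

MODELS = {
    "small": {"cost_per_1k": 0.1, "good_for": {"chat", "summarize"}},
    "large": {"cost_per_1k": 2.0, "good_for": {"chat", "summarize", "code", "reasoning"}},
}

AUDIT_LOG = []

def route(task, user, authorized_users=frozenset({"alice", "bob"})):
    if user not in authorized_users:  # Auth
        raise PermissionError(f"{user} is not authorized")
    # Smart model selection: cheapest model whose capabilities cover the task.
    candidates = [(m["cost_per_1k"], name) for name, m in MODELS.items()
                  if task in m["good_for"]]
    _, chosen = min(candidates)
    AUDIT_LOG.append({"ts": time.time(), "user": user,
                      "task": task, "model": chosen})  # Audit
    return chosen

print(route("summarize", "alice"))  # small
print(route("code", "alice"))       # large
```

A production router would add retries, latency-aware selection, and per-tenant budgets, but this is the core shape of auth, audit, and smart model selection.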
-
It's December! And I was just reminiscing about all the things that happened in / defined AI in 2023, putting together a short list of keywords that were top of my mind (in no particular order).

1) LLM efficiency & adapter methods: One of the biggest research threads has been making LLMs more efficient through various method optimizations (e.g., FlashAttention) and adapter methods (e.g., LoRA, QLoRA), probably motivated by budget, compute, and time constraints. It's one of the most exciting and refreshing developments for us practitioners.

2) A push for open source: After ChatGPT made a big impact about 1 year ago, and some of the bigger companies made their research and models (increasingly) private, we've seen much revitalizing activity around open source. To name a few examples:
- Llama 2 (still the best base model, in my opinion)
- GPT4All (a nice UI to run LLMs locally)
- Lit-GPT (a repo for finetuning and using various LLMs; disclaimer: I'm involved as a contributor)
- LlamaIndex (a toolkit for retrieval augmented generation with LLMs)
- LangChain (the popular LLM framework)

3) Big tech companies roll their own LLMs: Kickstarted by ChatGPT's success, every major company seems to be developing its own in-house LLM now, including Google's Bard, xAI's Grok, and Amazon's Q.

4) RLHF & DPO finetuning: I mentioned efficiency methods for finetuning (like LoRA) above. Another trend is toward better instruction-following. We are slowly moving from supervised finetuning to reinforcement learning with human feedback (RLHF), or rather a simpler alternative: direct preference optimization (DPO).

5) Retrieval augmented generation (RAG): Many businesses are still wary of implementing pure LLM solutions. RAG solutions let them connect LLMs to existing data or knowledge bases, which may be the better option for feeding LLMs new data (for now) due to reduced error, scalability, cost, etc.
6) AI regulation & copyright: These are still hot, important, and largely unresolved topics. Japan issued a statement this summer saying its copyright laws cannot be enforced against materials and works used in datasets to train AI systems. In the US, there is no similar statement as far as I know. However, US President Biden recently issued an executive order on AI regarding the safety and security of large AI systems.

7) From text-to-image to text-to-video: 2022 was the year of text-to-image diffusion models like DALL-E 2 and Stable Diffusion. 2023 was the year of LLMs. Text-to-image models never truly went away but continued to improve; it's more that everyone's attention (no pun intended) was largely on LLMs. Diffusion models recently had quite the comeback, though, with the latest releases of text-to-video tools like Stable Video Diffusion and Pika 1.0.

Also, so much happened on the research front! I'm excited about sitting down and compiling a list of recommendations of my favorite research papers of 2023 in a few weeks! #llms #AI #deeplearning
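The adapter idea from point 1 can be shown with toy-sized matrices: LoRA freezes the pretrained weight matrix W and learns only a low-rank update B·A, so the adapter stores d·r + r·d parameters instead of d·d. This is a pure-Python sketch with hand-picked values; real implementations use GPU tensors and train A and B by gradient descent.

```python
# Sketch of LoRA: the adapted layer computes (W + B @ A) @ x, where W is
# frozen and only the low-rank factors B (d x r) and A (r x d) are trained.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

d, r = 4, 1  # hidden size 4, rank-1 adapter: 8 trainable values vs 16 frozen

W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen
B = [[0.5], [0.0], [0.0], [0.0]]  # d x r, trainable
A = [[0.0, 1.0, 0.0, 0.0]]        # r x d, trainable

delta = matmul(B, A)  # rank-1 update to W
W_adapted = [[W[i][j] + delta[i][j] for j in range(d)] for i in range(d)]

x = [[0.0], [2.0], [0.0], [0.0]]  # column-vector input
print(matmul(W_adapted, x))       # [[1.0], [2.0], [0.0], [0.0]]
```

Because W never changes, many task-specific adapters can share one base model, which is much of why LoRA took off for practitioners on a budget.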
-
The LLMs Ecosystem Map: 2025 highlights how fast the space is moving, with companies building across multiple categories. Here's a breakdown of some key areas and notable companies driving innovation:

1. Observability: Companies like Aporia, Arize, Langfuse, Traceloop, WhyLabs, and Superwise are working on monitoring AI models to ensure performance, fairness, and explainability.
2. Orchestration & Model Deployment: Platforms like Anyscale, Iguazio, Kubeflow, BentoML, Seldon, and ZenML are helping teams deploy, manage, and scale models efficiently.
3. Experiment Tracking, Prompt Engineering & Optimization: Tools such as MLflow, Comet, Neptune.ai, Agenta, and PromptLayer are enabling teams to fine-tune and optimize large language models.
4. Monitoring, Testing, or Validation: Companies like Fiddler, Deepchecks, Giskard, Galileo, and AgentOps.ai are ensuring models remain accurate, unbiased, and free from failure.
5. Compliance & Risk: Platforms like Deepfence, Fairnow, Lumenova, Mission Control, and Trustible are focusing on regulatory compliance, governance, and risk mitigation.
6. Model Training & Fine-Tuning: Companies such as Abacus.AI, MosaicML, Predibase, Snorkel, and Scale are making model training more accessible and efficient.
7. End-to-End LLM Platforms: Large platforms like AWS, Google AI, Hugging Face, Databricks, Chroma, and ChatGPT are providing full-stack AI solutions.
8. Security & Privacy: With the rise of AI-driven security risks, companies like HiddenLayer, Guardrails AI, Mithril Security, Lakera, and Private AI are focusing on securing AI applications.
9. Apps & User Analytics: Companies like Nebuly AI, Sentify, Autoblocks, and Context are enabling businesses to track user interactions and optimize AI applications.

The trend is moving toward scalable, secure, and compliant AI systems, with an increasing emphasis on observability, privacy, and automation.
As more enterprises adopt LLMs, what are the biggest challenges you see in making AI more production-ready?